
Institutional Research using Data Mining and Machine Learning to Analyze Sources of Students, Learning Achievement, and Employment Status

Advisor: 鄒慶士

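Abstract


Most previous institutional research literature examines a single issue at a time. This study links several tables covering students' sources, in-school performance, and post-graduation destinations to determine whether the stages from admission to employment (Born To Be Employed) are related. The research method is Data Mining and Machine Learning (DMML), and the process comprises data merging, data preprocessing, model building, and interpretation of results. The techniques combine data cleaning, data de-identification, statistical tests, data visualization, and machine learning modeling. The analysis is carried out with the open-source big data and AI tool R, and the process is developed into a stand-alone web application so that school staff can use it for analysis without needing a programming background. A pre-built random forest model predicts the salary range a student is likely to fall into, which helps the relevant staff understand how students are learning, track their employment after graduation, and gain an overall view of the school's educational outcomes. The research results will be fed back to the school's teaching and administrative units for planning corresponding institutional development strategies.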

Parallel Abstract


The majority of earlier institutional research literature addressed individual issues in isolation. To investigate whether there is a relationship between admission and employment (Born To Be Employed), this study connects multiple datasets covering Sources of Students, Learning Achievement, and Employment Status. The study approach, Data Mining and Machine Learning (DMML), proceeds through data merging, data preprocessing, modelling, and result interpretation. The techniques used include data cleaning, data de-identification, statistical testing, data visualization, and machine learning modeling. R, an open-source big data and AI tool, is used for analysis and practical implementation, and the process is then developed into a stand-alone web application. The goal is to make data utilization and analysis simple for school staff members, irrespective of their programming experience. A pre-built Random Forest model predicts the salary range a student is likely to earn in the future. This helps the relevant staff understand how students are learning and monitor their post-graduation career development, giving a thorough view of the school's educational outcomes. The results will be fed back to the school's teaching and administrative units to support the planning of appropriate institutional development strategies.
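The merging and de-identification steps named in the abstract can be illustrated with a minimal sketch. The thesis reports using R and does not describe its implementation here, so what follows is written in Python purely for illustration. The field names (`student_id`, `admission_channel`, `gpa`, `salary_band`), the sample records, the `SALT` value, and the choice of a salted SHA-256 hash are all assumptions, not the author's method.

```python
import hashlib

# Hypothetical secret salt; in practice it would be stored outside the dataset.
SALT = "replace-with-a-secret-salt"


def pseudonymize(student_id, salt=SALT):
    """Map a student ID to a salted SHA-256 digest (truncated for readability)."""
    digest = hashlib.sha256((salt + student_id).encode("utf-8")).hexdigest()
    return digest[:16]


def deidentify(rows, id_field="student_id"):
    """Return {pseudonym: record without the raw ID} for one table."""
    table = {}
    for row in rows:
        record = {k: v for k, v in row.items() if k != id_field}
        table[pseudonymize(row[id_field])] = record
    return table


def merge_tables(*tables):
    """Inner-join de-identified tables on the shared pseudonym key."""
    common = set(tables[0])
    for table in tables[1:]:
        common &= set(table)
    merged = {}
    for key in sorted(common):
        combined = {}
        for table in tables:
            combined.update(table[key])
        merged[key] = combined
    return merged


# Made-up records standing in for the three kinds of table the abstract mentions.
admissions = [
    {"student_id": "S001", "admission_channel": "exam"},
    {"student_id": "S002", "admission_channel": "recommendation"},
]
grades = [
    {"student_id": "S001", "gpa": 3.4},
    {"student_id": "S002", "gpa": 3.8},
    {"student_id": "S003", "gpa": 2.9},  # no admission or employment record
]
employment = [
    {"student_id": "S001", "salary_band": "B"},
    {"student_id": "S002", "salary_band": "C"},
]

merged = merge_tables(
    deidentify(admissions), deidentify(grades), deidentify(employment)
)
```

Each row of `merged` holds one student's admission, grade, and employment fields under a pseudonym, which is the shape a classifier such as a random forest would need to predict a salary band. Hashing a short, guessable ID is pseudonymization rather than full anonymization, so the salt would have to be kept secret.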

