
以支援向量機處理題型符號與文字特徵應用於微積分試題難度分類

Difficulty Level Classification of Calculus Exam Questions Using SVMs with Descriptive Features of Symbols and Texts

Abstract


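This study establishes "question-type symbol and text features" for calculus exam questions. Through manual induction, symbol features are extracted from each type of calculus question and converted into vector representations. The question feature vectors are then reduced in dimension by principal component analysis (PCA) and by linear discriminant analysis (LDA), respectively, to find a feature space that better matches the difficulty distribution of the questions. Finally, a support vector machine (SVM) with an RBF kernel is applied to the dimension-reduced features to classify each question as hard, medium, or easy. As far as our literature review shows, the computational representation of the proposed "question-type symbol and text features" is an innovative feature-set design among related studies in Taiwan and abroad. The experimental results show that, under 5-fold cross-validation, the difficulty classification accuracy on a single test fold reaches up to 95%, and the average test accuracy over the 5 folds reaches 90.19%. These results are far above the 33.33% expected from random guessing. For 3 classes, the upper limit of the 95% confidence interval for random guessing is about 42.69%, so the average accuracy exceeds that limit by about 47.5 percentage points. This shows that the proposed "question-type symbol and text features" are significantly effective for classifying the difficulty of calculus exam questions.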
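The pipeline described above (feature vectors, then PCA or LDA reduction, then an RBF-kernel SVM evaluated with 5-fold cross-validation) can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic stand-ins for the question feature vectors, and the feature count, number of retained components, and SVM hyperparameters are all assumptions chosen only so the example runs.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the question feature vectors (NOT the study's data):
# 150 "questions", 40 features, 3 classes (0 = easy, 1 = medium, 2 = hard).
n_per_class, n_features = 50, 40
centers = rng.normal(scale=3.0, size=(3, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat([0, 1, 2], n_per_class)

# Two candidate reducers, mirroring the PCA / LDA comparison in the abstract.
# Component counts are assumptions; LDA allows at most (n_classes - 1) = 2.
reducers = {
    "PCA": PCA(n_components=10),
    "LDA": LinearDiscriminantAnalysis(n_components=2),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, reducer in reducers.items():
    # Scaling and reduction are fitted inside each training fold only,
    # so no information from the test fold leaks into the model.
    model = make_pipeline(StandardScaler(), reducer, SVC(kernel="rbf", gamma="scale"))
    results[name] = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

for name, scores in results.items():
    print(f"{name}: best fold = {scores.max():.3f}, mean = {scores.mean():.3f}")
```

The best-fold and mean scores printed here correspond in kind to the 95% and 90.19% figures reported in the abstract, but the numbers themselves come from synthetic data and say nothing about the study's results.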

Parallel Abstract


This paper proposes a new design of "descriptive symbol and text features" for calculus exam questions. The features are extracted from various types of calculus questions and represented as vectors. The high-dimensional feature vectors are then reduced by principal component analysis (PCA) or linear discriminant analysis (LDA) to find a lower-dimensional feature space that better fits the difficulty-level distribution of the questions. A support vector machine (SVM) with a radial basis function (RBF) kernel then classifies the questions into three difficulty levels: hard, medium, and easy. To the best of our knowledge, the proposed feature representation of the symbols and texts of mathematical questions is a novel design for estimating the difficulty level of calculus exam questions and is rarely seen in previous literature. In difficulty-level classification experiments with 5-fold cross-validation (CV), the highest classification accuracy on a single test fold is 95%, and the average accuracy over the 5 folds is 90.19%. These results are far higher than the 33.33% accuracy of random guessing. For three categories, the upper limit of the 95% confidence interval for random guessing is about 42.69%, so the average accuracy exceeds that upper limit by about 47.5 percentage points. This validates the significant effectiveness of the proposed descriptive features of symbols and texts for automatic difficulty-level prediction of calculus exam questions.
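The comparison against random guessing can be illustrated with a short calculation. The abstract does not state how the 95% confidence interval was computed or how many test questions it is based on, so the sketch below assumes the usual normal approximation for a binomial proportion; the function name `chance_upper_limit` and the sample size of 100 are hypothetical and used only for illustration.

```python
import math


def chance_upper_limit(n_questions: int, n_classes: int = 3, z: float = 1.96) -> float:
    """Upper limit of a two-sided 95% interval for chance-level accuracy.

    Assumes the normal approximation p + z * sqrt(p * (1 - p) / n), where
    p = 1 / n_classes. This is an assumed method, not one stated in the abstract.
    """
    p = 1.0 / n_classes
    standard_error = math.sqrt(p * (1.0 - p) / n_questions)
    return p + z * standard_error


# Hypothetical sample size: the abstract does not report the number of questions.
upper = chance_upper_limit(n_questions=100)
print(f"chance-level upper limit for n=100: {upper:.4f}")

# Margin reported in the abstract: mean CV accuracy minus the stated upper limit.
margin = 0.9019 - 0.4269
print(f"margin over the stated upper limit: {margin:.3f}")
```

The margin line reproduces the 47.5-point gap from the two figures given in the abstract. The upper limit depends on the sample size, so the value computed for the hypothetical n will differ from the reported 42.69%.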

