
A Performance Comparison of Test Dimensionality Assessment Methods

Abstract


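This study simulated multidimensional data with simple structure and with partially complex structure, manipulating the number of dimensions (1, 2, 3), the correlation between dimension abilities (0, 0.3, 0.6), the number of items per dimension (10, 20), and the sample size (250, 500, 1000, 2000), with 100 replications per condition, in order to compare four test dimensionality assessment methods: DETECT (Kim, 1994), NOHARM (McDonald, 1996), parallel analysis (PA; Horn, 1965), and the HULL method (Ceulemans & Kiers, 2006). The criterion for comparison was the percentage of replications in which each method correctly identified the number of dimensions. The main findings are as follows. (1) When the data are unidimensional, DETECT fails to yield a unidimensional solution. When the data are two-dimensional, DETECT performs poorly because the data structure is not fully consistent with the theory underlying DETECT, and the other three methods all outperform it. When the data are three-dimensional with simple structure, DETECT performs better than in the two-dimensional case. When the correlation between dimensions is less than 0.3, PA performs best; when the correlation is 0.6, DETECT and NOHARM outperform PA and HULL. For 3d1 data (data generated with a three-dimensional MIRT model in which all item vectors point in the same direction, so that the data are effectively unidimensional), each method performs as it does in the unidimensional case. (2) When the correlation between dimension abilities is below 0.6, PA and HULL outperform DETECT and NOHARM; when the correlation becomes larger, DETECT and NOHARM outperform PA and HULL. (3) As the number of items increases, the rate of correct identification rises for all four methods. (4) NOHARM, PA, and HULL appear to be largely unaffected by sample size, whereas the rate of correct identification for DETECT increases with sample size.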
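The design factors and the evaluation criterion named in the abstract can be sketched in a few lines of Python. This is only an illustration: the abstract does not say whether the factors were fully crossed (the correlation factor, for example, has no meaning for unidimensional data, and the 3d1 condition is not modelled here), so the full crossing below and the names `design_conditions` and `percent_correct` are assumptions.

```python
from itertools import product

# Factor levels as listed in the abstract.
N_DIMS = (1, 2, 3)
CORRELATIONS = (0.0, 0.3, 0.6)
ITEMS_PER_DIM = (10, 20)
SAMPLE_SIZES = (250, 500, 1000, 2000)
N_REPLICATIONS = 100


def design_conditions():
    """Enumerate conditions, ASSUMING a fully crossed design (not confirmed by the source)."""
    return [
        {"n_dims": d, "rho": r, "items_per_dim": i, "n_persons": n}
        for d, r, i, n in product(N_DIMS, CORRELATIONS, ITEMS_PER_DIM, SAMPLE_SIZES)
    ]


def percent_correct(estimates, true_n_dims):
    """Percentage of replications whose estimated dimensionality equals the true value."""
    if not estimates:
        raise ValueError("estimates must contain at least one replication")
    hits = sum(1 for e in estimates if e == true_n_dims)
    return 100.0 * hits / len(estimates)
```

In such a setup, each method would be run on `N_REPLICATIONS` simulated data sets per condition and `percent_correct` applied to the resulting list of estimated dimension counts.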

Keywords

number of dimensions; DETECT; NOHARM; parallel analysis; HULL

Parallel Abstract


The purpose of this study was to investigate the performance of four dimensionality assessment procedures, namely DETECT (Kim, 1994), NOHARM (McDonald, 1996), parallel analysis (PA; Horn, 1965), and the HULL method (Ceulemans & Kiers, 2006), in terms of how accurately they identify the number of dimensions underlying different multidimensional data sets. The number of dimensions, the correlation among the dimensions, the number of items per dimension, and the sample size were manipulated, and simulated responses were generated with 100 replications per condition. The main findings were as follows. First, when the data are unidimensional, DETECT is not able to obtain the proper one-dimensional solution. In the two-dimensional case, DETECT does not perform as well as expected because some of the item response data were generated to be within-item multidimensional, a structure that is not fully consistent with the theory underlying DETECT. DETECT performs better in the three-dimensional case than in the two-dimensional case because the item response data were generated to be between-item multidimensional. Parallel analysis performs better than the other methods when the correlation between domain abilities is less than .3, and DETECT and NOHARM outperform parallel analysis and HULL when the correlation is .6. For the so-called 3d1 data, in which the item responses were generated using an M2PL model but all items point in the same direction in the latent space, all methods give results similar to those in the unidimensional case. Second, PA and HULL outperformed DETECT and NOHARM when the correlation among dimensions was .3 or lower, and DETECT and NOHARM outperformed PA and HULL when the correlation was .6 or higher. Third, as the number of items increased, the accuracy of identifying the number of dimensions increased for all procedures. Finally, sample size did not seem to affect NOHARM, PA, or HULL, whereas the performance of DETECT improved as sample size increased.
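To make the between-item (simple structure) case and parallel analysis more concrete, the following is a minimal sketch and not the study's implementation. It assumes a compensatory M2PL generator with illustrative item parameter ranges, Pearson correlations of the 0/1 responses, and the common 95th-percentile variant of the retention rule; the function names `simulate_m2pl` and `parallel_analysis` are hypothetical.

```python
import numpy as np


def simulate_m2pl(n_persons, n_dims, items_per_dim, rho, rng):
    """Simulate 0/1 responses under a compensatory M2PL with simple structure.

    Item parameter ranges are illustrative assumptions, not the study's values.
    """
    # Equicorrelated abilities with unit variances.
    cov = np.full((n_dims, n_dims), float(rho))
    np.fill_diagonal(cov, 1.0)
    theta = rng.multivariate_normal(np.zeros(n_dims), cov, size=n_persons)

    # Each item loads on exactly one dimension (between-item multidimensionality).
    n_items = n_dims * items_per_dim
    a = np.zeros((n_items, n_dims))
    for d in range(n_dims):
        rows = slice(d * items_per_dim, (d + 1) * items_per_dim)
        a[rows, d] = rng.uniform(1.0, 2.0, items_per_dim)
    b = rng.uniform(-1.0, 1.0, n_items)

    prob = 1.0 / (1.0 + np.exp(-(theta @ a.T - b)))
    return (rng.random(prob.shape) < prob).astype(int)


def parallel_analysis(data, n_random=50, percentile=95, rng=None):
    """Count leading eigenvalues that exceed those of random normal data of the same shape."""
    if rng is None:
        rng = np.random.default_rng()
    n, k = data.shape
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    random_eigs = np.empty((n_random, k))
    for r in range(n_random):
        noise = rng.standard_normal((n, k))
        random_eigs[r] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)

    exceeds = observed > threshold
    # Index of the first eigenvalue that fails the test equals the number retained.
    return k if exceeds.all() else int(np.argmin(exceeds))


if __name__ == "__main__":
    rng = np.random.default_rng(2024)
    responses = simulate_m2pl(n_persons=1000, n_dims=2, items_per_dim=10, rho=0.3, rng=rng)
    print("Estimated number of dimensions:", parallel_analysis(responses, rng=rng))
```

Pearson correlations of dichotomous items are used here only for brevity; analyses of binary data often rely on tetrachoric correlations instead, and the source does not state which was used.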

Parallel Keywords

dimensionality; DETECT; NOHARM; parallel analysis; HULL

References


陳榮華、吳明雄、陳心怡 (2010). 新編多元性向測驗 [Multiple Aptitude Test, New Edition]. Taipei: 中國行為科學社.
Ackerman, T. A. (1992). A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement, 29(1), 67-91.
Ackerman, T. A., Gierl, M. J., & Walker, C. M. (2003). Using multidimensional item response theory to evaluate educational and psychological tests. Educational Measurement: Issues and Practice, 22(3), 37-53.
Adams, R. J., Wilson, M., & Wang, W. C. (1997). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21(1), 1-23.
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. 2nd International Symposium on Information Theory.
