A Simulation Study of Subscale Score Estimation Methods in Large-Scale Educational Assessments

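In recent years, methods for estimating subscale scores and their applications have begun to receive attention. For example, the score reports of large-scale assessments at home and abroad (TIMSS, PISA, NAEP, TASA) all present subscale scores for different ability dimensions. However, although researchers abroad have studied subscale scores, no comparable research has yet been conducted domestically, and no study has compared these methods under an equating test design. This study therefore uses simulation experiments to examine how well different subscale score computation methods estimate scores under a single test design and an equating test design across different testing conditions. In addition, this study proposes new subscale score computation methods and compares them with the existing ones. The results show that the newly proposed methods achieve better estimation precision across the different testing conditions.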

Keywords

large-scale assessments; subscale scores; test equating

Parallel Abstract

This paper examines subscale score estimation under two test designs: a single test design and an equating test design. It also presents two new methods for estimating subscale scores. Using simulated data, the study compares the estimation accuracy of different subscale score estimation methods. For the single test design, the factors considered are the correlation between subscales, sample size, the ratio of CR to MC items, the number of subscales, and test length. For the equating test design, the factors considered are the correlation between subscales, sample size, the collocation of anchor items, and the equating method. The results show that: 1. The new methods estimate subscale scores more accurately than the other methods. 2. Estimation error decreases as the correlation between subscales increases, whereas sample size does not affect estimation error. 3. In the single test design, estimation error decreases as the ratio of CR to MC items increases and as test length increases. 4. In the equating test design, the collocation of anchor items does not affect estimation error, and concurrent calibration based on item response theory is more accurate than equating based on classical test theory.

Parallel Keywords

large-scale assessments; subscale scores; test equating
