A Comparison of Adjusting Procedures for Minimizing Gaps in Raw-to-Scale Score Conversions for Large-Scale Assessments

大型測驗之原始至量尺分數轉換間距縮小調整法之比較研究

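Abstract


This study examines four methods for narrowing scale score gaps: the no adjustment method, the fixed mean method, the varying mean method, and the varying mean/SD method. Each attempts to reduce the gaps that arise when raw test scores are converted to scale scores through the arcsine transformation. The study relies entirely on simulated data, generated with the three-parameter extended beta-binomial model to resemble the score distributions of the five subject tests in the Basic Competence Test. The Basic Competence Test is a standardized test whose scores serve as the basis for senior high school admission in Taiwan. For each subject test, the study evaluates the properties of the scale scores after adjustment by each of the four methods, and examines the effect of narrowing the gaps at the high end of the scale to 3, 4, and 5 points. The evaluation criteria are the descriptive statistics of the scale scores, test reliability, overall measurement error, and the standard error of measurement at each ability level. The results indicate that no method narrows the gaps at the high end of the scale without producing some adverse effect. Weighing the advantages and disadvantages across all criteria, the varying mean/SD method appears to be the best of the four. The results should inform the feasibility of narrowing scale score gaps for the Basic Competence Test and for similar large-scale tests, and they offer test researchers and practitioners new suggestions and directions on how gap size should be considered when building an optimal score scale.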

Parallel Abstract


The present study investigated four adjustment procedures for minimizing the gaps that result from raw-to-scale score conversions via the arcsine transformation. The four methods (no adjustment, fixed mean, varying mean, and varying mean/SD) were compared using data simulated from the three-parameter extended beta-binomial model for the five tests in the Basic Competence Test (BCTEST), a national standardized assessment program in Taiwan. The desired gap sizes were set at 3, 4, and 5 scale score points at the high end of the scale. The adjustment methods were compared on summary statistics, reliability, the overall standard error of measurement (SEM), and the SEMs conditional on true score in proportion-correct units. The findings indicated that no single method could reduce the gaps at the high end of the scale without adversely affecting other scale properties. Weighing the advantages and disadvantages observed in this research, the varying mean/SD strategy was judged the most preferable. By exploring gap-minimizing approaches that relax the criteria for the desired scale score characteristics, this study sheds light on the possibilities for reducing gap sizes for the BCTEST in particular and for large-scale assessments with gap problems in general. The results offer psychometric researchers and test practitioners a broader perspective on gap issues when designing score scales that must meet both technically sound measurement standards and practical demands.
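To make the gap problem concrete, the following is a minimal sketch of how an arcsine-based raw-to-scale conversion can produce wider gaps at the ends of the scale than in the middle. It assumes the Freeman-Tukey form of the arcsine transformation followed by a linear rescaling and rounding; whether the study uses exactly this form is an assumption. The test length (40 items), the scale range (1 to 60), and the function names are hypothetical choices made only for illustration, and none of the four adjustment methods is implemented here.

```python
import math

def arcsine_transform(raw, n_items):
    """Freeman-Tukey style arcsine transformation of a raw score.

    Assumed form, not confirmed as the one used in the study.
    """
    return 0.5 * (
        math.asin(math.sqrt(raw / (n_items + 1)))
        + math.asin(math.sqrt((raw + 1) / (n_items + 1)))
    )

def scale_scores(n_items, lo=1, hi=60):
    """Map every raw score 0..n_items to a rounded scale score in [lo, hi].

    The scale range is a hypothetical choice for illustration.
    """
    g = [arcsine_transform(x, n_items) for x in range(n_items + 1)]
    g_min, g_max = g[0], g[-1]
    slope = (hi - lo) / (g_max - g_min)
    return [round(lo + (v - g_min) * slope) for v in g]

def gaps(scores):
    """Differences between scale scores of adjacent raw scores."""
    return [b - a for a, b in zip(scores, scores[1:])]

if __name__ == "__main__":
    scores = scale_scores(n_items=40)  # hypothetical 40-item test
    g = gaps(scores)
    print("gaps near the middle:", g[18:22])
    print("gaps at the high end:", g[-3:])
```

Because the arcsine transformation stretches the extremes of the raw score range, one raw score point near the top of the scale corresponds to several scale score points, which is the kind of gap the adjustment methods aim to reduce.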

