In recent years, educational databases have received widespread attention, because building such databases makes it possible to understand the learning outcomes of students nationwide and to compare changes across grades and across years. Establishing a common scale is therefore a crucial issue. Based on the three-parameter logistic model of item response theory (IRT), this study investigates the linking performance of two linking designs, the balanced incomplete block (BIB) design and the non-equivalent groups with anchor test (NEAT) design, when equating large-scale educational assessments administered to the same grade in different years. Simulation experiments manipulated the number of examinees, the number of items, the examinee ability distribution, the percentage of anchor items, and the range of item difficulty. The results show that, under a normal ability distribution, the estimation errors of the ability and item parameters decrease as the number of examinees increases and increase as the number of items increases. Among the anchor percentages examined, a 30% anchor generally yielded the best linking performance; among the difficulty ranges examined, ranges of -1 to 1 and -2 to 2 generally yielded the best linking performance.
The main purpose of this study is to explore the linking performance of two large-scale educational assessments administered in different years. The balanced incomplete block (BIB) design and the non-equivalent groups with anchor test (NEAT) design are two popular test equating methods in large-scale educational assessments. The effects of the number of examinees, the number of items, the ability distribution, the percentage of anchor items, and the range of anchor-item difficulty are explored under the two linking designs. Three ranges of difficulty parameters, (-3, 3), (-2, 2), and (-1, 1), are considered in this study. The simulation results show that when the data follow a normal distribution, the parameter estimation error decreases as the number of examinees increases and increases as the number of items increases. Better equating performance occurs when the percentage of anchor items is 30%, and the best equating performance occurs when the range of difficulty parameters is (-1, 1) or (-2, 2).
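For reference, the three-parameter logistic (3PL) model that serves as the theoretical basis of the study gives the probability of a correct response as P(θ) = c + (1 − c) / (1 + exp(−D·a·(θ − b))), where a is discrimination, b is difficulty, c is the pseudo-guessing lower asymptote, and D = 1.7 is the conventional scaling constant. The following minimal Python sketch illustrates the formula; the parameter values are illustrative only and are not taken from the study.

```python
import math

def three_pl(theta, a, b, c, D=1.7):
    """3PL item response function:
    P(theta) = c + (1 - c) / (1 + exp(-D * a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

# Illustrative values: a high-ability examinee (theta = 2.0) answering a
# medium-difficulty item (b = 0.0) with moderate guessing (c = 0.2).
p = three_pl(theta=2.0, a=1.0, b=0.0, c=0.2)
print(round(p, 3))  # prints 0.974
```

Note that as θ → −∞ the probability approaches the guessing floor c rather than 0, which is what distinguishes the 3PL model from the one- and two-parameter logistic models.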