This study compares the effectiveness of two item selection strategies for computerized adaptive testing, the maximum information method (MIM) and simulated annealing (SA), with random selection (RM) as the baseline. The test data come from the Chinese (verbal) and mathematics sections of the Basic Competence Test for academic year 90 (2001), the test used for high school entrance; two administrations of each subject are analyzed. Factor analysis confirmed that all four tests satisfy the unidimensionality assumption of item response theory. The item information, discrimination, and difficulty parameters estimated under the three-parameter model serve as input to the item selection strategies. Two stopping rules are used for the adaptive testing process. The first sets lower and upper limits on both test length and testing time. The second limits the test to at most twenty items or stops it once the standard error is 0.4 or less. The strategies are compared by the size of the standard error of the estimates.

Under the first stopping rule, SA yields a much smaller standard error than both MIM and RM on all four tests. Under the second stopping rule, SA uses slightly more items than MIM and RM but yields a much smaller standard error, except on the first Chinese test, where the standard error of SA is slightly larger than that of the other methods.

Future studies could compare SA with other item selection strategies such as the genetic algorithm. The strategies could also be compared within different ability groups to assess how well the conclusions of the present study generalize.
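As a rough illustration of the maximum information method and the second stopping rule described above, the following Python sketch computes item information under the three-parameter logistic model and picks the unused item with the largest information at the current ability estimate. It is a minimal sketch under stated assumptions, not the program used in the study: the scaling constant `D = 1.7`, the function names, and the representation of an item as an `(a, b, c)` tuple (discrimination, difficulty, guessing) are all hypothetical choices made for this example.

```python
import math

D = 1.7  # scaling constant; assumed here, the abstract does not state one


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Fisher information of one item at ability theta under the 3PL model."""
    p = p_3pl(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_max_information(theta, items, used):
    """Return the index of the unused item with the largest information at theta."""
    candidates = [i for i in range(len(items)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))


def standard_error(theta, items, used):
    """Standard error of the ability estimate: 1 / sqrt(total information)."""
    total = sum(item_information(theta, *items[i]) for i in used)
    return float("inf") if total == 0 else 1.0 / math.sqrt(total)


def should_stop(n_items, se, max_items=20, se_target=0.4):
    """Second stopping rule: twenty items administered or standard error <= 0.4."""
    return n_items >= max_items or se <= se_target
```

In an adaptive test, `select_max_information` would be called after each update of the ability estimate, and `should_stop` would be checked with the current test length and `standard_error`. How the ability estimate is updated, and how simulated annealing searches the item pool, are not specified in the abstract and are therefore left out of this sketch.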