
能力估計方法對多向度電腦化適性測驗測量精準度的影響

The Influences of the Ability Estimation Methods on the Measurement Accuracy in Multidimensional Computerized Adaptive Testing

Abstract


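This study analyzes how different ability estimation methods affect measurement accuracy in multidimensional computerized adaptive testing (MCAT). The study had two stages. Stage 1 sought the best number of quadrature points for the Bayesian expected a posteriori (EAP) method in MCAT. Stage 2 compared maximum likelihood (ML), EAP, and maximum a posteriori (MAP) estimation under different numbers of dimensions (two and four) and different correlations between abilities (low and high), for MCATs of different lengths (20, 40, 60, and 80 items), in terms of ability estimation reliability, bias, and root mean square error (RMSE). The Stage 1 results showed that the time required for item selection increased markedly as the number of EAP quadrature points increased (from 5 to 30 points) and as the number of ability dimensions increased. Weighing item selection time against measurement accuracy, setting the number of EAP quadrature points to 10 is an ideal choice in MCAT. The Stage 2 results showed that MAP and EAP yielded higher ability estimation reliability and lower RMSE than ML. The three methods did not differ noticeably in average bias, although MAP showed a clear regression bias. These patterns were more pronounced when the correlations between abilities were higher, when there were more ability dimensions, and when the test had fewer items. Overall, each of the three methods has strengths and weaknesses: the regression bias of MAP, the item selection time of EAP, and the reliability and measurement error of ML are the problems that need to be improved in future MCAT applications.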
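To make the role of the quadrature points concrete, the following is a minimal sketch of EAP estimation on a grid, not the study's implementation. It assumes a compensatory multidimensional 2PL item model, an equally spaced grid on [-4, 4], and an independent standard normal prior. The study itself also varies the correlation between abilities, which this sketch ignores. The function names `quadrature_grid`, `likelihood`, and `eap_estimate` and the item parameters are hypothetical.

```python
import itertools
import math


def quadrature_grid(n_points, n_dims, lo=-4.0, hi=4.0):
    """Return all nodes of an equally spaced grid: n_points ** n_dims tuples."""
    step = (hi - lo) / (n_points - 1)
    axis = [lo + i * step for i in range(n_points)]
    return list(itertools.product(axis, repeat=n_dims))


def likelihood(theta, responses, items):
    """Likelihood of 0/1 responses under an assumed compensatory 2PL model.

    Each item is a pair (a, b): a tuple of discriminations, one per
    dimension, and a scalar difficulty.
    """
    value = 1.0
    for u, (a, b) in zip(responses, items):
        logit = sum(a_k * t_k for a_k, t_k in zip(a, theta)) - b
        p = 1.0 / (1.0 + math.exp(-logit))
        value *= p if u == 1 else 1.0 - p
    return value


def eap_estimate(responses, items, n_points=10):
    """Posterior mean of each ability dimension, computed on the grid."""
    n_dims = len(items[0][0])
    grid = quadrature_grid(n_points, n_dims)
    weights = []
    for theta in grid:
        # Unnormalised independent standard normal prior (an assumption).
        prior = math.exp(-0.5 * sum(t * t for t in theta))
        weights.append(prior * likelihood(theta, responses, items))
    total = sum(weights)
    return [
        sum(w * theta[k] for w, theta in zip(weights, grid)) / total
        for k in range(n_dims)
    ]


# Hypothetical two-dimensional items: two load on dimension 1, one on dimension 2.
items = [((1.0, 0.0), 0.0), ((1.0, 0.0), 0.5), ((0.0, 1.0), 0.0)]
estimate = eap_estimate([1, 1, 0], items, n_points=10)
```

In this sketch the posterior is evaluated at every grid node, so the work per update grows as the number of points raised to the number of dimensions: 10 points in four dimensions give 10,000 nodes, while 30 points give 810,000. That growth is one plausible reason for the reported increase in time with more quadrature points and more dimensions.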

Parallel Abstract


This research investigated how ability estimation methods influence measurement accuracy in multidimensional computerized adaptive testing (MCAT). In stage 1, the number of quadrature points used by the Bayesian expected a posteriori (EAP) estimation was manipulated to find an appropriate number of quadrature points for EAP in MCAT. In stage 2, maximum likelihood (ML) estimation, Bayesian maximum a posteriori (MAP) estimation, and EAP estimation were compared under two numbers of ability dimensions (two and four) and two levels of correlation between dimensions (high and low). The target test lengths were 20, 40, 60, and 80 items. The dependent variables were the average reliability, bias, and root mean square error (RMSE) across all ability dimensions. The stage 1 results indicated that the estimation time of MCAT rose sharply as the number of quadrature points and the number of ability dimensions increased. Ten points were appropriate for MCAT with fewer than four dimensions when both estimation time and the reliability of ability estimation were taken into consideration. The stage 2 results indicated that the MAP and EAP methods yielded higher reliability and lower RMSE than the ML method, especially with high correlations between abilities, more ability dimensions, and fewer MCAT items. Each of the three estimation methods had advantages and disadvantages. The regression bias of MAP, the estimation time of EAP, and the reliability and RMSE of ML are the problems that should be resolved when implementing MCAT.
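The three dependent variables can be illustrated with a short sketch for one ability dimension. The abstract does not give the formulas, so the definitions here are assumptions: bias as the mean signed error, RMSE as the square root of the mean squared error, and reliability as the squared correlation between true and estimated abilities. The function names and the data are hypothetical.

```python
import math


def bias(true_thetas, estimates):
    """Mean signed error (estimate minus true value)."""
    n = len(true_thetas)
    return sum(e - t for t, e in zip(true_thetas, estimates)) / n


def rmse(true_thetas, estimates):
    """Root mean square error of the estimates."""
    n = len(true_thetas)
    return math.sqrt(sum((e - t) ** 2 for t, e in zip(true_thetas, estimates)) / n)


def reliability(true_thetas, estimates):
    """Squared Pearson correlation between true and estimated abilities
    (one common definition in simulation studies; assumed here)."""
    n = len(true_thetas)
    mean_t = sum(true_thetas) / n
    mean_e = sum(estimates) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_thetas, estimates))
    var_t = sum((t - mean_t) ** 2 for t in true_thetas)
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    return cov * cov / (var_t * var_e)


# Hypothetical simulees whose estimates are shrunk halfway toward the mean.
true_thetas = [-1.0, 0.0, 1.0]
estimates = [-0.5, 0.0, 0.5]
average_bias = bias(true_thetas, estimates)
error = rmse(true_thetas, estimates)
rel = reliability(true_thetas, estimates)
```

In this toy data the average bias is zero even though every non-zero ability is pulled toward the mean. A pattern like this shows how a method could exhibit regression bias without differing from the others in average bias.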
