Choice of Weighting Scheme in Forming the Composite

量尺總分加權機制之探討 (An Investigation of Weighting Schemes for Scale Composite Scores)

Abstract


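This study examines and compares five weighting schemes for forming a scale composite score: equal weighting, reliability weighting, standard deviation weighting, error-of-measurement weighting, and effective score point weighting. The aim is to find the most suitable nominal weights for each test subject so that the most appropriate composite can be formed, and to provide further information about these schemes once the measurement properties and score distribution characteristics of each subject test are taken into account in computing the weights. The study uses the five subject tests of the Basic Competence Test for Junior High School Students, with a random sample of 5,000 examinee score records from the 2005 (ROC year 94) administration. The schemes are evaluated on three criteria: the statistical and psychometric properties of the subject scores and the weighted composites, the effective contribution of each subject to the composite, and the impact on senior high school admission decisions. The results show that the composites formed under every scheme have very high reliability coefficients. When the effective contribution of each subject is examined individually, however, the schemes differ greatly. Neither standard deviation weighting nor error-of-measurement weighting achieves the ideal of roughly equal effective contributions across subjects, but overall these two schemes still perform better than reliability weighting or effective score point weighting. The findings and recommendations, which address how best to combine subject test scores and related issues concerning scale composite scores, should be useful for both testing research and practice.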

Parallel Abstract


The present study investigated and compared composite scores formed under five weighting schemes: the equally weighted model, the reliability weighting model, the standard deviation (SD) weighting model, the error of measurement weighting model, and the effective score point model. The purpose was to seek optimal relative weights for forming the best possible composite, and to offer more information about how the weighting schemes behave given the different measurement qualities of the tests. The five tests of the Basic Competence Test (BCTEST) were used. A random sample of 5,000 examinees was drawn from the 2005 test administration. To evaluate the weighting schemes, the study examined the statistical and psychometric properties of the test scores and the weighted composite scores, the effective contributions of individual tests to the composites, and the impact on admission decisions. The findings indicated that the reliability coefficients of the composite scores were very high under every scheme. The effective contributions, however, differed greatly among the weighting schemes. Overall, the SD and the error of measurement weighting models seemed to perform better in establishing the composites than the reliability or the effective score point model, although the effective contributions remained unequal under both the SD and the error of measurement models. Results from this study should advance the understanding of weighting issues that arise when individual test components are combined into composites.
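As a rough illustration of the quantities compared above, the following Python sketch computes effective contributions and composite reliability for a weighted composite. The abstract does not give formulas for the five schemes, so the sketch relies on textbook classical test theory definitions as assumptions: the effective contribution of test i is taken to be w_i · Cov(X_i, C) / Var(C), composite reliability assumes uncorrelated measurement errors, and SD weighting is taken to mean weights of 1/SD. The function names and all numbers are hypothetical and are not BCTEST data.

```python
import numpy as np


def effective_contributions(cov, weights):
    """Share of composite variance attributable to each test.

    For a composite C = sum_i w_i * X_i, test i is credited with
    w_i * Cov(X_i, C) / Var(C). The shares sum to 1.
    """
    cov = np.asarray(cov, dtype=float)
    w = np.asarray(weights, dtype=float)
    cov_with_composite = cov @ w            # Cov(X_i, C) for each test i
    composite_var = w @ cov_with_composite  # Var(C) = w' S w
    return w * cov_with_composite / composite_var


def composite_reliability(cov, weights, reliabilities):
    """Reliability of a weighted composite, assuming uncorrelated errors."""
    cov = np.asarray(cov, dtype=float)
    w = np.asarray(weights, dtype=float)
    rel = np.asarray(reliabilities, dtype=float)
    # Error variance of test i is Var(X_i) * (1 - reliability_i).
    error_var = np.sum(w**2 * np.diag(cov) * (1.0 - rel))
    return 1.0 - error_var / (w @ cov @ w)


def sd_weights(cov):
    """Assumed form of SD weighting: each test is weighted by 1 / SD."""
    return 1.0 / np.sqrt(np.diag(np.asarray(cov, dtype=float)))


# Hypothetical three-test battery: unequal SDs, all correlations 0.5.
sds = np.array([10.0, 5.0, 2.0])
corr = np.full((3, 3), 0.5)
np.fill_diagonal(corr, 1.0)
cov = np.outer(sds, sds) * corr
rel = np.array([0.9, 0.9, 0.9])  # hypothetical test reliabilities

equal_w = np.ones(3)
inv_sd_w = sd_weights(cov)

print("equal weights:", effective_contributions(cov, equal_w))
print("1/SD weights: ", effective_contributions(cov, inv_sd_w))
print("composite reliability (equal):", composite_reliability(cov, equal_w, rel))
print("composite reliability (1/SD): ", composite_reliability(cov, inv_sd_w, rel))
```

In this toy setup, equal nominal weights let the test with the largest SD dominate the composite, while 1/SD weights equalize the contributions only because every pair of tests was given the same correlation. With unequal correlations, as in real batteries, the contributions would generally remain unequal, which is consistent with the inequality reported for the SD model above.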

