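An Investigation of the Validation of Standard Setting: The Case of a Large-Scale Mathematics Learning Achievement Assessment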

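This study examines the validity of standard setting by validating performance standards from procedural, internal, and external perspectives. On the one hand, it uses generalizability theory and other important criteria to provide internal validity evidence for the final standards; on the other, it examines the quality of the overall standard-setting process and presents procedural and external evidence to support the validity of the final standards. The setting is a large-scale mathematics learning achievement assessment, for which basic, proficient, and advanced achievement levels were set. The materials were 104 multiple-choice items from the Grade 4 and Grade 6 assessments, and 16 experts and teachers with experience in compiling or teaching elementary school mathematics materials were recruited to form the standard-setting panel. The standard-setting procedure was designed around the modified Angoff method and mainly followed the approaches and guidelines recommended by NAEP (Allen, Jenkins, Kulick, & Zelenak, 1997), Reckase (1998, 2000), and Hambleton (2001). The cut scores corresponding to each achievement level were validated, and relevant evidence and criteria are presented to show the validity of the performance standards. The results show that the Grade 4 standard setting yielded good internal and procedural evidence supporting the validity of its performance standards, although the external evidence was somewhat flawed; the Grade 6 performance standards obtained satisfactory validity evidence in the internal, external, and procedural respects.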
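The cut-score computation behind an Angoff-style procedure can be illustrated with a minimal sketch. It assumes the common textbook formulation, in which each panelist estimates, for every item, the probability that a borderline examinee answers correctly, and the cut score is the mean of the panelists' summed ratings. The function name `angoff_cut_score` and the ratings below are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def angoff_cut_score(ratings):
    """Return (cut score, standard error across panelists).

    ratings[p][i] is panelist p's estimated probability that a
    borderline examinee answers item i correctly (hypothetical data).
    """
    # Each panelist's implied cut score is the sum of their item ratings.
    panelist_cuts = [sum(row) for row in ratings]
    # The panel cut score is the mean of the panelists' cut scores.
    cut = mean(panelist_cuts)
    # Standard error of that mean, treating panelists as a random sample.
    se = stdev(panelist_cuts) / sqrt(len(panelist_cuts))
    return cut, se


# Illustrative ratings: 3 panelists x 4 items (made-up values).
ratings = [
    [0.50, 0.75, 0.25, 1.00],
    [0.50, 0.50, 0.50, 0.50],
    [0.75, 0.75, 0.25, 0.75],
]
cut, se = angoff_cut_score(ratings)
print(f"cut score = {cut:.3f} raw-score points, SE = {se:.3f}")
```

A smaller standard error across panelists is one of the internal consistency indicators that a multi-round procedure is expected to improve.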

Keywords

Angoff method; validation; standard setting

Parallel Abstract

The purpose of this study was to validate the performance standards and cut scores set for a large-scale mathematics assessment. To examine whether consistency improved, the study applied different procedures and approaches within the modified Angoff standard-setting method, including feedback on empirical item difficulties and on results from the previous rating round. A generalizability study was performed to investigate the sources and magnitude of variation in the results across several facets: item, panelist, and item difficulty. The main conclusions are summarized below: 1. Cut scores increased gradually and logically across the basic, proficient, and advanced achievement levels. However, item difficulty had a great impact on panelists' judgments, which confirmed the need for multiple rounds, discussion, and feedback in a multi-stage standard-setting process. 2. As the rounds progressed, inconsistency among panelists was removed or adjusted, and error in the performance standards declined to its lowest level in the final round. The accuracy and consistency of the cut scores in both grades were moderate. 3. Finally, procedural and internal evidence supported the validity of the performance standards at each achievement level, while external evidence needs to be strengthened.

Parallel Keywords

Angoff method; standard setting; validation

References

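謝進昌 (2005). 以最大測驗訊息量決定通過分數之研究 [A study of determining cut scores by maximum test information] (Master's thesis). Educational and Psychological Counseling Division, Department of Education, National Chengchi University.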

Allen, N. L., Jenkins, F., Kulick, E., & Zelenak, C. A. (1997). Technical report of the NAEP 1996 state assessment program in mathematics. National Center for Education Statistics.

Angoff, W. H. (1971). In R. L. Thorndike (Ed.), Educational measurement. Washington, DC: American Council on Education.

Berk, R. A. (1986). A consumer's guide to setting performance standards on criterion-referenced tests. Review of Educational Research, 56(1), 137-172.

Brandon, P. R. (2004). Conclusions about frequently studied modified Angoff standard-setting topics. Applied Measurement in Education, 17(1), 59-88.

Cited By

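謝進昌 (2021). 以「補充性表現水平描述輔助自陳式測量構念」之延伸Angoff標準設定研究 [An extended Angoff standard-setting study using supplementary performance level descriptors to support self-report measurement constructs]. Bulletin of Educational Psychology, 53(2), 307-334. https://doi.org/10.6251/BEP.202112_53(2).0003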

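林小慧, & 吳心楷 (2019). 科學探究能力評量之標準設定與其效度檢核 [Standard setting and validity checking for a scientific inquiry ability assessment]. Bulletin of Educational Psychology, 50(3), 473-502. https://doi.org/10.6251/BEP.201903_50(3).0005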

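莊郁芳 (2011). 高中學生公民與社會科學習成就影響因素之研究：2007年TASA資料庫分析 [A study of factors influencing senior high school students' learning achievement in Civics and Society: An analysis of the 2007 TASA database] [Master's thesis, National Taiwan Normal University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0021-1610201315241848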
