
How Should School Management Portfolios Be Evaluated? A Reliability and Validity Analysis of School Management Portfolios

Portfolio Assessment: Reliability and Validity in School Management

Abstract


School management competency is one of the essential capabilities required of preservice principals. In past principal preparation programs, trainees submitted numerous assignments and documents focused on cultivating various aspects of school management, but these materials were never collected systematically. This study designed a school management portfolio assessment for preservice principals and examined its reliability and validity. The effects of scorer severity, the scoring rubrics, and whether scorers were mentor principals or external experts were also investigated. The results show that, in constructing their portfolios, preservice principals handled school background information well but performed relatively poorly on innovative management; external experts scored more severely than mentor principals, and the scores of both groups were affected by error. Overall, the school management portfolio assessment showed generally good evidence of convergent validity, but discriminant validity was less satisfactory.

Keywords

school management, portfolio assessment, reliability, validity

Parallel Abstract


Portfolio assessment provides an opportunity to understand characteristics of a student that cannot be gleaned from responses to traditional test questions; the characteristics obtained through portfolio assessment are more integrated and reflect a more comprehensive picture of a student's learning experiences. Portfolio assessment has become more popular in recent years as a tool for measuring a student's suitability for admission into university. Portfolio assessment allows learners to self-monitor their progress and achievement of learning goals through a systematic collection of assignments. The role of an instructor is to guide students in constructing a portfolio, to set learning goals, and to provide opportunities for self-assessment and peer feedback. Many scholars and experts in Taiwan have built a rich theoretical foundation and performed practical research on portfolio assessment at the university, elementary, and secondary levels; few studies have investigated the use of portfolio assessment in other contexts.

This study investigates the reliability and validity of portfolio assessment in the context of school management and, in particular, its use by preservice principals. The influence of scorer severity, the scoring rubrics, and whether scorers were internal or external experts was investigated. In current preservice principal training courses, preservice principals are given many assignments. Although the assignment content focuses on the various aspects of school management, a systematic collection of assignments is lacking. In particular, practical implementation is hindered by an excessive number of assignments and by assignments with overlapping purposes. These disorganized and unsystematic assignments are a burden to both preservice principals and scorers. Based on the six professional competency indicators of preservice principals (visioning, strategic thinking, teamwork, communication and coordination, innovative management, and self-reflection), this study systematically constructs a course that uses portfolio assessment. However, evidence of the validity of portfolio assessment in the context of school management must be further investigated.

Preservice principal training courses cover a variety of topics, including school development, administration, professional responsibility, public relations, curriculum and instructional leadership, educational visitation, teacher learning, liberal arts, and integrated activities; such courses also often incorporate aspects of mentorship. Mentorship learning is designed to help preservice principals complete their assignments. Typically, six mentoring sessions with senior school principals are provided on the content and practices required in the school development portfolio. In addition to written assignments, individual sharing is required, and mentors provide feedback.

In total, 61 preservice principals enrolled in the 8-week training course participated in this study. Each student was required to produce a portfolio of assignments related to school management that covered five major topics: (1) school background information, (2) goals and action plans, (3) school curriculum design, (4) school innovation plan, and (5) self-reflection. These five components are a primary focus of school development. Each class had two coaches and two external experts who scored the portfolios; a total of 12 experts were involved in the scoring process.
At the start of the course, the coaches of each class introduced the function of the portfolios and explained the key content that had to be included in each section. Students were asked to upload the content of their portfolios to the system two to three times as the course progressed. For example, after the "Data-Driven School Research" seminar, students were asked to complete an inventory of school records, followed by questions and forms to guide them through the process. At each step, the coaches shared, discussed, and advised students before moving on to the next section. An overall grade was given after completion of the entire portfolio.

Reliability was assessed using the multifacet Rasch model (MFRM); item difficulty and scorer severity were both considered when estimating ability. For validity, convergent and discriminant validity were examined by correlating portfolio scores with test scores, other assignment scores, and scores based on the coaches' daily observations. The infit and outfit mean squares at each level are close to 1, which indicates good fit. In the Rasch model, reliability represents the reproducibility of the measure; ability as measured in a test with high reliability is more likely to be consistent with ability in the real world. Most reliability indices are relatively high; the lowest values occur in the estimation of the preservice principals' ability, which may be due to the high homogeneity and similarity of the preservice principals. Although the sample size was not large, the fit of the model was acceptable, which means that the data in this study are suitable for analysis with the MFRM.

Students performed poorly in creating innovative plans but performed well in presenting school background information, which may be because most of them come from backgrounds offering little opportunity to make decisions about the direction of school affairs. For example, a background as a director predisposes an individual to being uncertain about the direction of school innovation, whereas presenting information related to a school's background is straightforward for a student of any background. The external experts scored more severely than the coaches, probably because, as mentors, the coaches had developed close relationships with the students, which made it difficult for them to give low scores. Finally, a strong correlation between portfolio scores and oral test scores was found; the oral tests were intended to give students an opportunity to present their school management philosophies. A significant correlation between portfolio scores and weekly reflection journal scores was also observed and is an indication of validity. A significant correlation between portfolio scores and behavioral performance scores was also observed, possibly because both of these items were scored by the coaches and may therefore have been affected by the halo effect, resulting in lower discriminant validity.

Although the scorers are experts in the field, they still need to be trained in scoring. A rigorous scorer training program should be designed in the future; for example, calibration cases could be placed in the scoring process to detect rater bias and allow immediate adjustment. Exploring how to design a more convenient and easier-to-use model of scorer training that increases consensus among scorers and reduces subjective bias is worthwhile.
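The abstract refers to the MFRM and to infit and outfit mean squares without stating the equations. As a minimal sketch for reference, a common three-facet rating-scale formulation and the usual fit statistics are given below; the symbols for person ability, component difficulty, scorer severity, and category threshold are illustrative and may differ from the authors' exact specification.

\log\frac{P_{nijk}}{P_{nij(k-1)}} = \theta_n - \delta_i - \alpha_j - \tau_k

% \theta_n: ability of preservice principal n
% \delta_i: difficulty of portfolio component i
% \alpha_j: severity of scorer j
% \tau_k: threshold between rating categories k-1 and k

\text{Outfit MS} = \frac{1}{L}\sum_{n,i,j} z_{nij}^{2}, \qquad
\text{Infit MS} = \frac{\sum_{n,i,j} W_{nij}\, z_{nij}^{2}}{\sum_{n,i,j} W_{nij}}

% z_{nij}: standardized residual of each observed rating
% W_{nij}: model variance of the observation; L: number of observations
% Mean squares close to 1 indicate that the ratings fit the model.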
In this study, internal and external experts showed different sources of error in scoring; external experts were more biased when evaluating contextual data. Therefore, in future courses, external experts should be tasked with scoring factual information, and internal experts (e.g., mentor principals) should be tasked with scoring contextual information, such as the rationale for or internal consistency of a portfolio. Exploring how the halo effect might be avoided is another potential area of improvement; incorporating anonymous cross-class grading or peer-graded comments and suggestions might help. Cross-disciplinary collaboration should also be considered as a way of enhancing innovative thinking; team activities, such as group brainstorming, may help students generate innovative ideas. In conclusion, this study provides evidence of the reliability and validity of portfolio assessment in the context of school management. However, careful consideration is still required to determine whether it is suitable as an examination tool.
