中文文本可讀性探討:指標選取、模型建立與效度驗證

Investigating Chinese Text Readability: Linguistic Features, Modeling, and Validation

Abstract


This study (a) develops readability indicators based on characteristics of Chinese text that influence reading comprehension, (b) constructs mathematical readability models for Chinese text, and (c) validates those models. Using the 24 indicators developed here as predictor variables and the grade levels of 386 textbook articles as the criterion variable, the study builds a stepwise regression model and a support vector machine (SVM) model, then validates both on an additional 96 new articles used as test data. In the stepwise regression model, the important predictors are the number of complex words, the proportion of simple sentences, the mean logarithm of content-word frequency, and the number of personal pronouns. In the SVM model, the important predictors selected by the F-score method include the number of complex words, the number of two-character words, the number of characters, and the number of intermediate-stroke characters. On the new articles, the stepwise regression and SVM models reach prediction accuracies of 55.21% and 72.92%, respectively. Both models predict lower-grade texts more accurately than higher-grade texts.
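
To make the reported accuracy rates concrete, the short sketch below recovers how many of the 96 test articles each model would have to place at the correct grade level. The counts are inferred from the percentages and the test-set size in the abstract; the abstract does not state them, and the function name `correct_count` is only an illustrative helper.

```python
def correct_count(accuracy_percent: float, n_test: int) -> int:
    """Find the whole number of correct predictions whose rate,
    rounded to two decimals, matches the reported accuracy."""
    for k in range(n_test + 1):
        # Allow for rounding of the reported percentage to two decimals.
        if abs(100 * k / n_test - accuracy_percent) < 0.005:
            return k
    raise ValueError("no whole count matches this accuracy")


N_TEST = 96  # number of new articles used for validation (from the abstract)

stepwise_correct = correct_count(55.21, N_TEST)  # stepwise regression model
svm_correct = correct_count(72.92, N_TEST)       # SVM model

print(stepwise_correct, svm_correct)
print(svm_correct - stepwise_correct)
```

Under these assumptions, 55.21% corresponds to 53 of 96 articles and 72.92% to 70 of 96, so the SVM model classifies 17 more test articles correctly than the stepwise regression model.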
