
A Study on Chinese Spelling Check Using Confusion Sets and N-gram Statistics

Abstract


This paper proposes an automatic method for building a Chinese spelling check system. The confusion sets were expanded using two language resources, Shuowen Jiezi and the Four-Corner codes, which improved their coverage. Nine scoring functions that use frequency data from the Google Ngram datasets were proposed, and smoothing was also applied. Thresholds were also determined automatically. The final system performed far better than our baseline system in the CSC 2013 Evaluation Task.
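The general idea of combining confusion sets with n-gram frequencies can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the confusion sets, the bigram counts, the single log-count scoring function with additive smoothing, and the fixed threshold are all invented, and none of the paper's nine scoring functions or its automatic threshold selection is reproduced here.

```python
import math

# Hypothetical toy confusion sets: each character maps to the characters it
# might be confused with. These entries are invented for illustration.
CONFUSION = {
    "在": {"再"},
    "再": {"在"},
}

# Hypothetical bigram counts standing in for web-scale n-gram frequencies.
BIGRAM_COUNTS = {
    ("再", "見"): 900,
    ("在", "見"): 3,
    ("現", "在"): 1200,
    ("現", "再"): 2,
}


def score(chars, i, smoothing=1.0):
    """Sum the log of smoothed counts of the bigrams covering position i."""
    total = 0.0
    if i > 0:
        count = BIGRAM_COUNTS.get((chars[i - 1], chars[i]), 0)
        total += math.log(count + smoothing)
    if i < len(chars) - 1:
        count = BIGRAM_COUNTS.get((chars[i], chars[i + 1]), 0)
        total += math.log(count + smoothing)
    return total


def correct(sentence, threshold=2.0):
    """Replace a character only if a confusable candidate beats the
    original's score by more than the (assumed, hand-picked) threshold."""
    chars = list(sentence)
    for i, original in enumerate(chars):
        base = score(chars, i)
        best_char, best_score = original, base
        for candidate in CONFUSION.get(original, ()):
            trial = chars[:i] + [candidate] + chars[i + 1:]
            s = score(trial, i)
            if s - base > threshold and s > best_score:
                best_char, best_score = candidate, s
        chars[i] = best_char
    return "".join(chars)


if __name__ == "__main__":
    print(correct("在見"))
    print(correct("現再"))
```

The threshold keeps the sketch from replacing a character when the candidate is only marginally more frequent, which is one plausible reason a system of this kind would need thresholds at all.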


