
The Use of Clustering Techniques for Language Modeling-Application to Asian Language

Abstract


Cluster-based n-gram modeling is a variant of normal word-based n-gram modeling. It attempts to make use of the similarities between words. In this paper, we present an empirical study of clustering techniques for Asian language modeling. Clustering is used to improve the performance (i.e. perplexity) of language models as well as to compress language models. Experimental tests are presented for cluster-based trigram models on a Japanese newspaper corpus and on a Chinese heterogeneous corpus. While the majority of previous research on word clustering has focused on how to get the best clusters, we have concentrated our research on the best way to use the clusters. Experimental results show that some novel techniques we present work much better than previous methods, and achieve more than 40% size reduction at the same level of perplexity.
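To make the idea of a cluster-based n-gram model concrete, here is a minimal sketch of the standard hard-cluster decomposition, shown for bigrams rather than trigrams to keep it short: P(w | w_prev) is approximated by P(c | c_prev) * P(w | c), where c and c_prev are the clusters of the two words. This is not the paper's method. The corpus, the cluster assignments, and the names `train_cluster_bigram` and `perplexity` are all assumptions made up for illustration, and the estimates are unsmoothed maximum likelihood.

```python
from collections import Counter
import math

def train_cluster_bigram(tokens, word2cluster):
    """Return prob(word, prev_word) under a hard-cluster bigram model.

    Unsmoothed maximum-likelihood estimates:
        P(w | w_prev) ~= P(c | c_prev) * P(w | c)
    where c and c_prev are the clusters of w and w_prev.
    """
    clusters = [word2cluster[w] for w in tokens]
    cluster_bigrams = Counter(zip(clusters, clusters[1:]))
    context_counts = Counter(clusters[:-1])  # clusters that have a successor
    cluster_counts = Counter(clusters)
    word_counts = Counter(tokens)

    def prob(word, prev_word):
        c, c_prev = word2cluster[word], word2cluster[prev_word]
        # Raises ZeroDivisionError if c_prev was never seen as a context.
        p_cluster = cluster_bigrams[(c_prev, c)] / context_counts[c_prev]
        p_word = word_counts[word] / cluster_counts[c]
        return p_cluster * p_word

    return prob

def perplexity(prob, tokens):
    """Per-word perplexity over consecutive pairs (fails on zero probabilities)."""
    log_sum = sum(math.log2(prob(w, prev)) for prev, w in zip(tokens, tokens[1:]))
    return 2 ** (-log_sum / (len(tokens) - 1))

# Toy data: hypothetical corpus and hand-assigned clusters.
tokens = "monday we met tuesday we met friday we left".split()
word2cluster = {
    "monday": "DAY", "tuesday": "DAY", "friday": "DAY",
    "we": "PRON", "met": "VERB", "left": "VERB",
}
prob = train_cluster_bigram(tokens, word2cluster)

# "left monday" never occurs in the corpus, yet it gets probability mass
# because VERB -> DAY transitions were observed.
print(prob("monday", "left"))
print(perplexity(prob, tokens))
```

In this sketch the model needs at most one transition entry per cluster pair plus one P(w | c) entry per word, instead of one entry per word pair, which gives some intuition for why clustering can shrink a model. A practical system would add smoothing and learn the clusters from data; neither is shown here.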

Keywords

None listed
