Building a Bracketed Corpus Using Φ^2 Statistics

Journal article (Open Access)

Yue-Shi Lee; Hsin-Hsi Chen

Computational Linguistics and Chinese Language Processing (中文計算語言學期刊), Vol. 2, No. 2 (August 1997), pp. 1-23

https://doi.org/10.30019/IJCLCLP.199708.0001

Abstract

Research based on treebanks is ongoing for many natural language applications. However, building a large-scale treebank is laborious and time-consuming, so speeding up the process has become an important task. This paper proposes two versions of a probabilistic chunker to aid the development of a bracketed corpus. The basic version partitions part-of-speech sequences into chunk sequences, which form a partially bracketed corpus. The recursive version applies the chunking action recursively to generate a fully bracketed corpus. The training corpus is not a treebank but a corpus tagged with part-of-speech information only. Experimental results show that the probabilistic chunker achieves a correct rate of more than 94% in producing a partially bracketed corpus and gives very encouraging results in generating a fully bracketed corpus. Both versions are simple but effective and can also be applied to many natural language applications.
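To make the idea of partitioning a part-of-speech sequence with an association statistic concrete, here is a minimal sketch in Python. It is not the authors' algorithm, which the abstract does not specify. It assumes that the Φ^2 statistic is computed from a 2x2 contingency table of adjacent tag pairs, φ² = (ad − bc)² / ((a+b)(a+c)(b+d)(c+d)), and that a chunk boundary is placed wherever the score between neighbouring tags falls below a threshold. The function names `build_phi2` and `chunk_tags`, the threshold value, and the toy corpus are all hypothetical.

```python
from collections import Counter


def build_phi2(tagged_sentences):
    """Count adjacent POS-tag pairs and return a phi^2 scoring function.

    Assumption: each sentence is a list of POS tags only (no bracketing).
    """
    pair = Counter()   # count of x immediately followed by y
    left = Counter()   # count of x in the left position of any pair
    right = Counter()  # count of y in the right position of any pair
    n = 0              # total number of adjacent pairs
    for tags in tagged_sentences:
        for x, y in zip(tags, tags[1:]):
            pair[(x, y)] += 1
            left[x] += 1
            right[y] += 1
            n += 1

    def phi2(x, y):
        # 2x2 contingency table for the pair (x, y)
        a = pair[(x, y)]      # x followed by y
        b = left[x] - a       # x followed by something other than y
        c = right[y] - a      # something other than x followed by y
        d = n - a - b - c     # neither x on the left nor y on the right
        denom = (a + b) * (a + c) * (b + d) * (c + d)
        return 0.0 if denom == 0 else (a * d - b * c) ** 2 / denom

    return phi2


def chunk_tags(tags, phi2, threshold):
    """Split a tag sequence wherever adjacent tags are weakly associated.

    Hypothetical boundary rule: cut when phi2(x, y) < threshold.
    """
    if not tags:
        return []
    chunks = [[tags[0]]]
    for x, y in zip(tags, tags[1:]):
        if phi2(x, y) >= threshold:
            chunks[-1].append(y)   # strong association: stay in the chunk
        else:
            chunks.append([y])     # weak association: start a new chunk
    return chunks


if __name__ == "__main__":
    # Toy POS-tagged corpus, invented for illustration only.
    corpus = [
        ["DT", "NN", "VB", "DT", "NN"],
        ["DT", "JJ", "NN", "VB"],
        ["NN", "VB", "DT", "NN"],
    ]
    phi2 = build_phi2(corpus)
    print(chunk_tags(["DT", "JJ", "NN", "VB"], phi2, threshold=0.5))
```

A recursive variant could in principle treat each resulting chunk as a single unit and repeat the procedure on the shortened sequence, but how the paper scores and labels such units is not stated in the abstract, so that step is left out here.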

Keywords

Bracketed Corpus; Probabilistic Chunkers; Treebank; Φ^2 Statistics
