
Word Sense Disambiguation with Chinese Wordnet

Advisor: 謝舒凱

Abstract


This thesis aims to build a sense tagger based on the senses of the Chinese Wordnet. The Chinese Wordnet is an important representation of lexical semantics: a knowledge base that provides comprehensive sense distinctions and lexical semantic relations for Chinese. Building a sense tagger requires Word Sense Disambiguation (WSD), a long-standing but still unsolved problem in Natural Language Processing (NLP). The study approaches the problem with supervised learning and neural word embeddings, with experiments on a lexical sample of selected words. The training data are manually sense-annotated texts from the Sinica Corpus (the Academia Sinica Balanced Corpus) and the PTT Corpus. LOPE Text Analytics, the practical application of this study, gives researchers a system that combines the sense tagger with other text analysis modules.
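To make the idea of supervised, embedding-based WSD on a lexical sample concrete, here is a minimal sketch. It is not the thesis's implementation: the abstract does not name a classifier, so a simple nearest-centroid rule stands in for it. The 2-dimensional vectors, the English target word "bank", and the sense labels `bank.finance` and `bank.river` are all hypothetical toy data rather than Chinese Wordnet senses or real corpus annotations.

```python
from collections import defaultdict
from math import sqrt

# Toy 2-d "embeddings" (hypothetical); a real system would load pretrained vectors.
EMB = {
    "money": (1.0, 0.0), "loan": (0.9, 0.1), "deposit": (0.8, 0.2),
    "river": (0.0, 1.0), "water": (0.1, 0.9), "shore": (0.2, 0.8),
}


def context_vector(tokens):
    """Average the embeddings of the context tokens that have a vector."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def train(examples):
    """Map each sense label to the centroid of its annotated context vectors."""
    grouped = defaultdict(list)
    for tokens, sense in examples:
        grouped[sense].append(context_vector(tokens))
    return {
        sense: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        for sense, vecs in grouped.items()
    }


def predict(centroids, tokens):
    """Return the sense whose centroid is most similar to the context vector."""
    vec = context_vector(tokens)
    return max(centroids, key=lambda sense: cosine(centroids[sense], vec))


# Hypothetical sense-annotated contexts for the target word "bank".
TRAIN = [
    (["money", "loan"], "bank.finance"),
    (["deposit", "money"], "bank.finance"),
    (["river", "water"], "bank.river"),
    (["shore", "river"], "bank.river"),
]

centroids = train(TRAIN)
print(predict(centroids, ["loan", "deposit"]))
```

In a lexical-sample setting, one such model would be trained per ambiguous target word, with the manually annotated corpus sentences supplying the `(context, sense)` pairs.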
