A Model for Word Sense Disambiguation

Word sense disambiguation is one of the most difficult problems in natural language processing. This paper proposes a model that maps the structural semantic space of a thesaurus into a multi-dimensional, real-valued vector space, and presents a word sense disambiguation method based on this mapping. Because the model acquires its disambiguation knowledge through unsupervised learning, it avoids extensive manual work and makes it possible to sense-tag a large number of content words. First, the Chinese thesaurus Cilin and a very large corpus are used to construct the structure of the semantic space. Then, a dynamic disambiguation model is developed that disambiguates an ambiguous word according to the vectors of the monosemous words in each of its possible categories. To address data sparseness, a method is proposed that makes the model more robust. Test results show that the model performs relatively well and that the approach can also be applied to other languages.
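The core idea, choosing a category for an ambiguous word by comparing its context with vectors built from the monosemous words of each candidate category, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy thesaurus and corpus, the sentence-level co-occurrence counts, the cosine similarity measure, and the names `category_vectors`, `cosine`, and `disambiguate`. The paper's actual vector construction and its treatment of data sparseness are not described in this abstract and are not reproduced here.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy thesaurus: category label -> monosemous member words.
CATEGORY_MEMBERS = {
    "finance": ["loan", "deposit"],
    "river": ["shore", "stream"],
}

# Hypothetical toy corpus of tokenized sentences.
CORPUS = [
    ["the", "loan", "carries", "interest"],
    ["a", "deposit", "earns", "interest", "money"],
    ["the", "shore", "has", "water", "mud"],
    ["the", "stream", "carries", "water"],
]


def category_vectors(members, corpus):
    """Sum the context-word counts of each category's monosemous members."""
    vectors = {category: Counter() for category in members}
    for sentence in corpus:
        for category, words in members.items():
            for word in words:
                if word in sentence:
                    # Every other token in the sentence counts as context.
                    vectors[category].update(t for t in sentence if t != word)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def disambiguate(context, candidates, vectors):
    """Return the candidate category whose vector is closest to the context."""
    context_vector = Counter(context)
    return max(candidates, key=lambda c: cosine(context_vector, vectors[c]))


VECTORS = category_vectors(CATEGORY_MEMBERS, CORPUS)
sense = disambiguate(["money", "interest"], ["finance", "river"], VECTORS)
```

In this sketch the ambiguous word itself never needs labeled training examples: only the monosemous words, whose category is unambiguous in the thesaurus, contribute to the category vectors, which is what makes the approach unsupervised.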

Keywords

natural language processing; word sense disambiguation; unsupervised learning; vector space; language modeling


