On Utilization of Ontology and Retrofitting Techniques for Better Distributed Representations of Words and Senses

Advisor: Hsin-Hsi Chen (陳信希)

Abstract

The growing number of natural language processing tasks has increased the need for better distributed representations of words (word embeddings) and senses (sense embeddings) in recent years. In this study, we first examine the problem of abnormal dimensions in word embeddings, and then propose models that combine word embeddings with an ontology. We discuss three ways of combining them: a direct combination approach, a support vector regression approach, and a retrofitting approach. For sense embeddings, we first propose a joint sense retrofitting model that learns better sense embeddings from contextual and ontological information, and then generalize the proposed model.

