
A Study on Pre-trained Word2Vec Model in Deep Learning Based Natural Language Processing

Abstract


When applying deep learning to semantic analysis, machine translation, text classification, and other related applications in the field of Natural Language Processing (NLP), word embeddings must be prepared and trained in advance. The quality of each word vector in the embeddings directly affects the accuracy of the deep learning model. This study explores how to use the Word2Vec model to learn vector representations of words when applying deep learning to natural language processing. After this preprocessing step, the resulting word vectors can be fed into a deep learning discriminative model to generate predictions and to support a variety of downstream applications.
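To make the preprocessing step concrete, below is a minimal NumPy sketch of the skip-gram idea behind Word2Vec on a toy corpus: each target word predicts its surrounding context words, and the learned input matrix serves as the word embeddings. All names (`W_in`, `W_out`, `training_pairs`, the corpus itself) are illustrative assumptions, not the study's actual setup; a real system would train on a large corpus with an optimized library such as gensim.

```python
import numpy as np

# Toy corpus standing in for real training text (illustrative only).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
]

# Build the vocabulary and word-to-index mapping.
vocab = sorted({w for sent in corpus for w in sent})
w2i = {w: i for i, w in enumerate(vocab)}
V, dim = len(vocab), 8

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, dim))   # input (target-word) vectors = the embeddings
W_out = rng.normal(scale=0.1, size=(V, dim))  # output (context-word) vectors

def training_pairs(corpus, window=2):
    """Yield (target, context) index pairs within a sliding window."""
    for sent in corpus:
        for i in range(len(sent)):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if j != i:
                    yield w2i[sent[i]], w2i[sent[j]]

lr = 0.05
for epoch in range(200):
    for t, c in training_pairs(corpus):
        v = W_in[t]
        scores = W_out @ v                  # score of every word as context of t
        p = np.exp(scores - scores.max())   # softmax over the vocabulary
        p /= p.sum()
        grad = p.copy()                     # d(cross-entropy)/d(scores) = p - onehot(c)
        grad[c] -= 1.0
        grad_in = W_out.T @ grad            # gradient w.r.t. the target vector
        W_out -= lr * np.outer(grad, v)     # update context vectors
        W_in[t] -= lr * grad_in             # update the embedding of the target word

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Words appearing in similar contexts (e.g. "cat" and "dog") tend toward similar vectors.
sim = cos(W_in[w2i["cat"]], W_in[w2i["dog"]])
```

The rows of `W_in` are the trained word vectors; in the pipeline described by the abstract, these would then be passed to a discriminative deep learning model as its input features.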

