Fine-Grained Aspect-Based Sentiment Analysis with Unsupervised Semantic Document Vectors

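Vector representations of documents play a central role in many natural language processing applications. General-purpose representations learned in an unsupervised manner are especially valuable for these applications. In practice, sentiment analysis is a difficult task that is nonetheless regarded as deeply semantic, so it is often used to assess the quality of learned representations. Existing methods for unsupervised document representation learning fall into two main families: sequential methods, which explicitly take word order into account, and non-sequential methods, which do not. However, each family has its own unresolved weaknesses. In this paper, we propose a model that addresses the difficulties faced by both families. Experiments show that the proposed method substantially outperforms the existing state-of-the-art methods on both standard sentiment analysis and fine-grained aspect-based sentiment analysis.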

Keywords

Document vectors; Sentence vectors; Unsupervised learning; Sentiment analysis; Semantic learning; Text classification

Parallel Abstract

Document representation is at the core of many NLP tasks in machine understanding. A general representation learned in an unsupervised manner preserves generality and can be used for various applications. In practice, sentiment analysis (SA) is a challenging task that is regarded as deeply semantic and is often used to assess general representations. Existing methods for unsupervised document representation learning can be separated into two families: sequential ones, which explicitly take the ordering of words into consideration, and non-sequential ones, which do not. However, both families suffer from their own weaknesses. In this paper, we propose a model that overcomes the difficulties encountered by both families of methods. Experiments show that our model outperforms state-of-the-art methods by a large margin on popular SA datasets and on a fine-grained aspect-based SA task.

Parallel Keywords

Document representation; Sentence embedding; Unsupervised learning; Sentiment analysis; Semantic learning; Text classification
