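ConceptNet is a semantic network that represents everyday knowledge in a form that is more convenient for computation. What sets ConceptNet apart is that its nodes represent not traditional lexical terms but higher-level semantic concepts, which may be formed by combining several words, such as '吃午餐' (eat lunch) or '充飢' (satisfy hunger); these nodes are therefore called 'concepts'. Two concepts are connected by a directed edge, and each directed edge corresponds to a relation type that expresses the semantic relation between the two concepts. These directed edges are called 'relations'.

Sentiment analysis seeks to identify the emotions and opinions implicit in a text. Many current methods rely on sentiment information about words or phrases to help them analyze texts, but in Chinese sentiment analysis the vocabulary coverage of existing resources is still limited. Because ConceptNet contains many nodes and these nodes carry high-level semantic concepts, some later methods have proposed predicting sentiments for the concepts in ConceptNet in order to increase the coverage of Chinese resources. These methods assign a single sentiment to each concept, yet a concept may carry different sentiments in different situations, as with '尖叫' (scream) and '突然' (suddenly). This study aims to extract the contextual information hidden in Chinese ConceptNet and use it to determine the sentiment of each concept in different situations.

To reach this goal, we propose a topic-aware sentiment propagation system. Within the system, we use Latent Dirichlet Allocation to divide Chinese ConceptNet into different topic layers and propagate sentiment values separately on each topic layer to predict the sentiment value of each concept under each topic. We then use these topic-aware concept sentiments to support polarity analysis of texts: we propose identifying the topic by observing the other concepts that occur together and, based on that topic, selecting a suitable sentiment value for each concept that appears in the text.

This study designs three experiments to measure the effect of topic-aware prediction. The experimental results show that extracting topics improves both the prediction of concept sentiments and the prediction of the polarities of the test texts.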
ConceptNet is a semantic network that stores general knowledge in a more computable representation. In ConceptNet, the notion of a node is extended from purely lexical terms to include higher-order compound concepts, e.g., 'eat lunch' and 'satisfy hunger'; these nodes are called concepts. Two concepts may be connected by a directed edge. Each directed edge is associated with one of the predefined relation types, e.g., 'CapableOf' or 'Causes', which represents the semantic relation between the two concepts it connects. These directed edges are called relations.

Sentiment analysis aims to identify the attitudes or emotions behind texts. For most approaches, sentiment information about terms or phrases is an important resource. However, in Chinese sentiment analysis, the coverage of available resources is still limited. To increase the coverage, some methods have been developed to predict sentiments for the nodes of Chinese ConceptNet, motivated by its large size and the high semantic level of its nodes. Current approaches assign a single sentiment to each concept, but a concept may in fact carry different sentiments in different contexts, as with '尖叫' (scream) and '突然' (suddenly). In this thesis, we aim to extract the hidden contextual information in Chinese ConceptNet and use it to estimate the sentiment of each concept in different situations.

To achieve this goal, we design a topic-aware sentiment propagation system. We propose using Latent Dirichlet Allocation to divide Chinese ConceptNet into different topic layers. On each topic layer, we perform sentiment propagation through certain types of relations to predict the sentiment of each concept under that topic. Our second goal is to use the resulting topic-aware sentiments of concepts to improve polarity classification of texts. We propose using the other co-occurring concepts in a text to identify its topic and then selecting the proper sentiment for each concept in the text.

This thesis conducts three experiments to investigate the effect of topic-aware sentiment prediction. The results show that extracting topics helps both in predicting the sentiment values of concepts and in predicting the polarities of our test texts.
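
The idea of topic-aware propagation can be illustrated with a small sketch. It is not the system described in this thesis: the toy graph, the topic names, the seed sentiments, and the functions `propagate` and `text_polarity` are all hypothetical. The per-concept topic weights are simply assumed to be given (in the thesis they would come from Latent Dirichlet Allocation), edges are treated as undirected, relation types are ignored, and propagation is plain weighted neighbour averaging.

```python
from collections import defaultdict

# Hypothetical toy graph: pairs of connected concepts (relation types omitted).
EDGES = [("尖叫", "演唱會"), ("演唱會", "開心"), ("尖叫", "鬼屋"), ("鬼屋", "害怕")]

# Assumed P(topic | concept); in practice these would be estimated, e.g. by LDA.
TOPIC_WEIGHT = {
    "尖叫": {"entertainment": 0.5, "horror": 0.5},
    "演唱會": {"entertainment": 1.0},
    "開心": {"entertainment": 1.0},
    "鬼屋": {"horror": 1.0},
    "害怕": {"horror": 1.0},
}

# Assumed seed sentiments in [-1, 1] for concepts with known polarity.
SEEDS = {"開心": 1.0, "害怕": -1.0}


def propagate(edges, topic_weight, seeds, topic, iterations=20):
    """Spread seed sentiments over one topic layer by weighted neighbour averaging."""
    neighbours = defaultdict(list)
    for a, b in edges:
        # An edge belongs to a topic layer only as far as both endpoints do.
        w = topic_weight[a].get(topic, 0.0) * topic_weight[b].get(topic, 0.0)
        if w > 0:
            neighbours[a].append((b, w))
            neighbours[b].append((a, w))
    values = {c: seeds.get(c, 0.0) for c in topic_weight}
    for _ in range(iterations):
        updated = {}
        for c in values:
            if c in seeds:
                updated[c] = seeds[c]  # seed values stay fixed
                continue
            total = sum(w for _, w in neighbours[c])
            weighted = sum(values[n] * w for n, w in neighbours[c])
            updated[c] = weighted / total if total else 0.0
        values = updated
    return values


def text_polarity(concepts, topic_weight, layers):
    """Pick the topic best supported by the co-occurring concepts, then sum
    the concepts' sentiments on that topic layer."""
    scores = {
        k: sum(topic_weight.get(c, {}).get(k, 0.0) for c in concepts)
        for k in layers
    }
    best = max(scores, key=scores.get)
    return best, sum(layers[best].get(c, 0.0) for c in concepts)


# One sentiment table per topic layer.
layers = {
    k: propagate(EDGES, TOPIC_WEIGHT, SEEDS, k)
    for k in ("entertainment", "horror")
}

print(layers["entertainment"]["尖叫"], layers["horror"]["尖叫"])
print(text_polarity(["尖叫", "演唱會"], TOPIC_WEIGHT, layers))
print(text_polarity(["尖叫", "鬼屋"], TOPIC_WEIGHT, layers))
```

In this toy setting '尖叫' receives a positive value on the entertainment layer and a negative value on the horror layer, and a text is scored using the layer that its co-occurring concepts point to. A single global propagation over the same graph would instead pull '尖叫' toward one averaged value.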