  • Thesis

中文時態自動標記及其在因果論元偵測之應用

Automatic Chinese Tense Tagging and Its Application to Causal Effect Detection

Advisor: Hsin-Hsi Chen (陳信希)

Abstract


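Causal analysis plays an important role in natural language processing, with applications such as event extraction, causal inference, and question answering. This thesis investigates the role of tense information in Chinese causal analysis: it proposes a semi-supervised model that tags Chinese tense automatically and uses the tense information as features for causal discourse classification and causal argument identification. English marks tense grammatically, expressing different tenses through verb inflection combined with auxiliaries. Chinese verbs, however, do not change form with tense, so tense must be inferred from the surrounding collocates, which makes prediction harder than in English. In this thesis we propose a semi-supervised learning strategy: working from the UM-Corpus Chinese-English parallel corpus, we define rules to obtain tense from the English side and then, through Chinese-English word-level alignment, project the tense labels onto the Chinese side. After automatically generating a large amount of tense-labelled Chinese data, we use it to improve the performance of a dependency-based convolutional neural network model for Chinese tense prediction. Finally, we add the tenses predicted by the tense classification model as features in experiments on causal discourse relation classification and on identifying the cause and effect within causal discourse relations; we also use manually annotated gold tenses to analyze the effect of temporal features on classification. Experimental results show that tense shifts between sentences follow different usage patterns for different causal relations, and that adding tense information as features significantly improves accuracy in both causal discourse classification and causal argument identification.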
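The projection step described above (rule-based tense extraction on the English side, followed by transfer through word alignment) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the thesis's actual rules, so the three Penn Treebank based rules, the coarse labels `past`/`present`/`future`, and the function names `english_tense` and `project_tense` are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def english_tense(tokens: List[str], pos_tags: List[str], i: int) -> Optional[str]:
    """Return a coarse tense for the English token at index i, or None.

    Hypothetical rules over Penn Treebank POS tags; the thesis's real
    rule set is not given in the abstract.
    """
    tag = pos_tags[i]
    prev = tokens[i - 1].lower() if i > 0 else ""
    if tag == "VB" and prev in {"will", "shall", "'ll"}:
        return "future"   # modal + base form
    if tag == "VBD":
        return "past"     # simple past
    if tag in {"VBP", "VBZ"}:
        return "present"  # simple present
    return None           # not a verb form these rules cover


def project_tense(
    en_tokens: List[str],
    en_pos: List[str],
    alignment: List[Tuple[int, int]],
) -> Dict[int, str]:
    """Project English tense labels onto Chinese token indices.

    `alignment` holds (english_index, chinese_index) pairs, as a word
    aligner would produce. The first label assigned to a Chinese token wins.
    """
    labels: Dict[int, str] = {}
    for en_i, zh_i in alignment:
        tense = english_tense(en_tokens, en_pos, en_i)
        if tense is not None:
            labels.setdefault(zh_i, tense)
    return labels


# Toy sentence pair: "He will visit Beijing tomorrow" / 他 明天 去 北京
en = ["He", "will", "visit", "Beijing", "tomorrow"]
pos = ["PRP", "MD", "VB", "NNP", "NN"]
zh = ["他", "明天", "去", "北京"]
align = [(0, 0), (2, 2), (3, 3), (4, 1)]

projected = project_tense(en, pos, align)
print({zh[i]: t for i, t in projected.items()})  # {'去': 'future'}
```

Pseudo-labels produced this way are noisy, since alignment errors and uncovered verb forms both lose information, which is presumably why they serve as additional training data for the tense classifier rather than as a gold standard.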

Keywords

Causal Inference; Causal Analysis

Parallel Abstract


Causal analysis is an attractive topic in natural language processing and can be applied to a variety of tasks such as event extraction, causality inference, and question answering. This thesis explores the role of tense information in Chinese causal analysis. A semi-supervised approach is proposed for Chinese tense labelling. Experiments on two tasks, causal type classification and causal directionality identification, show significant improvements from tense features. Unlike English, Chinese has no grammatical tense marking, so predicting the tense of a Chinese verb is more challenging. Based on English-Chinese parallel data from UM-Corpus, we propose an approach that automatically aligns the tense information from English sentences to their Chinese counterparts. The resulting large set of pseudo-labelled Chinese tense instances is used to train the Chinese tense predictor. Our semi-supervised approach improves the dependency-based convolutional neural network (DCNN) models for Chinese tense labelling. Finally, the Chinese tense information is used as features for the tasks of causal type classification and causal directionality identification. Experimental results show that the tense features significantly improve performance on both tasks.

Parallel Keywords

Causal Analysis

