Solving the Chinese Recognizing Textual Entailment Problem with Formal Logic Methods

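In applications of natural language processing, understanding natural language has always been a very challenging problem. Traditional natural language processing research focused on understanding the semantics and logic of language, whereas current research focuses on approaches based on massive data and machine learning. Although both approaches have their own strengths and weaknesses, traditional semantic models are now rarely discussed in natural language processing research, and current machine learning methods have limits on the problems they can solve. Integrating traditional semantics with machine learning methods is therefore a direction worth studying. We construct a system that solves the Chinese recognizing textual entailment problem with formal logic methods. Building on the theories of formal semantics and computational semantics, we first use machine learning to convert Chinese sentences into parse trees, and then use our proposed algorithm to convert the parse trees into semantic representations. We further propose a method for integrating external knowledge with the semantic representations, and we solve the Chinese recognizing textual entailment problem by theorem proving. We then demonstrate that our system can solve problems with relatively simple sentence patterns, and we discuss the possibilities and challenges of solving real-world application problems. Finally, we summarize the strengths and weaknesses of the system and identify feasible directions for future research to improve it.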
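To make the final step of this pipeline concrete, the following is a minimal sketch of deciding entailment by logical inference over event-style semantic representations. It is not the system described in the abstract: the predicate names, the example sentence, the single hypernym rule standing in for external knowledge, and all function names are hypothetical, and a simple forward-chaining procedure over Horn rules is swapped in for a full first-order theorem prover.

```python
# Atoms are tuples: (predicate, arg1, ...). Strings starting with "?" are variables.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(pattern, fact, binding):
    """Extend `binding` so that `pattern` equals the ground `fact`, or return None."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    binding = dict(binding)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return binding

def solutions(patterns, facts, binding=None):
    """Yield every binding that satisfies all patterns against the fact set."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for fact in facts:
        extended = match(patterns[0], fact, binding)
        if extended is not None:
            yield from solutions(patterns[1:], facts, extended)

def substitute(atom, binding):
    return (atom[0],) + tuple(binding.get(t, t) for t in atom[1:])

def closure(facts, rules):
    """Forward-chain Horn rules (body, head) until no new fact appears.
    Assumes every head variable also occurs in the rule body."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Collect bindings first so the fact set is not mutated mid-iteration.
            for b in list(solutions(body, facts)):
                new_fact = substitute(head, b)
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

def entails(text_facts, rules, hypothesis):
    """True if the hypothesis (a conjunction, variables read existentially)
    follows from the text facts plus the knowledge rules."""
    return next(solutions(hypothesis, closure(text_facts, rules)), None) is not None

# Hypothetical text: "John bought a book", in an event-based representation.
TEXT = {("buy", "e1"), ("agent", "e1", "john"), ("theme", "e1", "b1"), ("book", "b1")}
# Hypothetical external knowledge: every buying event is an acquiring event.
RULES = [([("buy", "?e")], ("acquire", "?e"))]
# Hypothetical hypothesis: "John acquired something."
HYPOTHESIS = [("acquire", "?e"), ("agent", "?e", "john"), ("theme", "?e", "?x")]

if __name__ == "__main__":
    print(entails(TEXT, RULES, HYPOTHESIS))
```

The sketch covers only conjunctions of atoms and Horn-style rules; negation, quantifier scope, and disjunction would need a full first-order prover.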

Keywords

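Formal Semantics; Computational Semantics; Natural Language Understanding; First-Order Logic; Chinese Recognizing Textual Entailment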

Parallel Abstract

In natural language processing (NLP), understanding natural language has always been a challenging problem. Traditionally, NLP research focused on the semantics and logic of natural language, whereas the current trend relies on big data and machine learning techniques. Both approaches have their pros and cons; however, traditional work on semantics and logic is seldom discussed in recent research, and existing machine learning techniques also have their limitations. Combining traditional work on semantics with machine learning techniques is therefore a direction worth investigating. We build a system that solves the Chinese recognizing textual entailment (RTE) problem with formal logic methods. Based on the theories of formal semantics and computational semantics, we first use machine learning techniques to convert Chinese sentences into syntax trees. We then propose an algorithm that converts the syntax trees into semantic representations. We also propose a method for integrating external knowledge resources with these semantic representations, so that theorem proving techniques can be used to solve the Chinese RTE problem. We demonstrate that our approach can solve some simple cases of Chinese RTE, and we discuss the possibilities and difficulties of solving real-world cases. Finally, we point out the strengths and weaknesses of our system and suggest directions for future research to improve it.

Parallel Keywords

Formal Semantics; Computational Semantics; Natural Language Understanding; First-Order Logic; Chinese Recognizing Textual Entailment

