A Unified Recursive Neural Network Framework for End-to-End Chinese Discourse Parsing

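Chinese discourse parsing comprises four subtasks: elementary discourse unit (EDU) segmentation, parse tree construction, center labeling, and sense (discourse relation) labeling. This paper demonstrates an end-to-end Chinese discourse parser and proposes a unified framework that produces a complete discourse parse directly from an input Chinese text. The parser is based on a recursive neural network (RvNN) and learns the four subtasks jointly, achieving state-of-the-art performance on the Chinese Discourse Treebank (CDTB) dataset. We release the parser's source code together with a pre-trained model that is ready to use. To the best of our knowledge, this is the first open-source toolkit for Chinese discourse parsing. The standalone toolkit requires no external resources (such as a syntactic parser), which makes it easy to integrate into downstream applications.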

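Keywords

Natural Language Processing; Chinese Discourse Parsing; Recursive Neural Network; Discourse Structure; Elementary Discourse Unit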

Parallel Abstract

This paper demonstrates an end-to-end Chinese discourse parser. We propose a unified framework based on a recursive neural network (RvNN) that jointly models four subtasks: elementary discourse unit (EDU) segmentation, tree structure construction, center labeling, and sense labeling. Experimental results show that our parser achieves state-of-the-art performance on the Chinese Discourse Treebank (CDTB) dataset. We release the source code with a pre-trained model for the NLP community. To the best of our knowledge, this is the first open-source toolkit for Chinese discourse parsing. The standalone toolkit can be integrated into downstream applications without external resources such as a syntactic parser.

Parallel Keywords

Natural Language Processing; Chinese Discourse Parsing; Recursive Neural Network; Discourse Structure; Elementary Discourse Unit
