

Definite Abstract Anaphora Resolution in Chinese Texts

Advisor: 梁婷




Keywords: anaphora resolution, abstract anaphora


Anaphora is a common phenomenon in written texts in which a term refers back to a previously mentioned entity. Chinese texts contain pronominal anaphora, zero anaphora, and nominal anaphora, and the referents can be abstract objects or entities. In this thesis, we focus on definite abstract noun anaphora and propose a clause-based anaphora resolution procedure. Anaphora identification and feature selection draw on CLINE, CKIP lexical resources, Google search results, and other sources. Anaphora recognition with a finite-state machine achieves 90% precision on 1,538 instances. To classify candidate antecedents, we extract four types of features: position, distance, lexicon, and semantic features. These features are used to build SVM classifiers and a weighted model for resolving anaphora, and the best feature set is found by a genetic algorithm. On 241 definite anaphora instances, the SVM classifier achieves 40.66% on correct clause position and 68.46% on correct sentence position, while the weighted method achieves 42.32% and 70.54%, respectively.
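The weighted model mentioned in the abstract can be pictured roughly as follows. This is only a minimal sketch, not the thesis's implementation: the `Candidate` type, the feature values, and the weights in `WEIGHTS` are all hypothetical, and the abstract does not state how feature scores are actually combined. It assumes each candidate clause has one normalized score per feature group and that the highest weighted sum wins.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate antecedent clause (hypothetical structure)."""
    clause_index: int   # position of the clause in the text
    features: dict      # feature group name -> score in [0, 1]


# Hypothetical weights, one per feature group named in the abstract.
# In the thesis the feature set is tuned by a genetic algorithm;
# these numbers are illustrative only.
WEIGHTS = {"position": 0.2, "distance": 0.3, "lexicon": 0.2, "semantic": 0.3}


def score(candidate, weights=WEIGHTS):
    """Weighted sum of the candidate's feature scores (missing ones count as 0)."""
    return sum(w * candidate.features.get(name, 0.0)
               for name, w in weights.items())


def resolve(candidates, weights=WEIGHTS):
    """Return the highest-scoring candidate clause, or None if there are none."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: score(c, weights))


if __name__ == "__main__":
    far = Candidate(1, {"position": 1.0, "distance": 0.5,
                        "lexicon": 0.0, "semantic": 0.2})
    near = Candidate(2, {"position": 0.5, "distance": 1.0,
                         "lexicon": 1.0, "semantic": 0.8})
    best = resolve([far, near])
    print(best.clause_index)
```

A linear combination keeps each feature group's contribution easy to inspect, which is one plausible reason to compare such a model against an SVM classifier.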




