  • Thesis


Combining corpus statistics and knowledge base to disambiguate and acquire verb frames

Advisor: 張俊盛


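Verbs play a pivotal role in syntax, and learning the relevant verb frames is an important task for language learners. Unfortunately, most online dictionaries today, such as the Longman Dictionary, offer only coarse, schematic representations of verb frames, typically the generic labels "something" and "somebody". Such coarse labels do little to help language learners. In this thesis, we propose a method for automatically acquiring verb semantic frames, and we provide a more comprehensive and more understandable representation. We first extract the arguments related to each verb according to the syntactic relations obtained from a corpus and assemble them into verb argument tuples, then use a knowledge base to determine the semantic category of each verb argument. Next, we resolve ambiguity by assembling verb semantic tuples and estimating their probabilities of occurrence, which finally yields the verb frames. We developed a prototype system, FrameFinder, that generates verb semantic frames automatically with this method, and we used a manually edited dictionary of verb patterns as the gold standard. The evaluation shows that the method achieves satisfactory accuracy for common verbs.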


Verb frames are very important for language learners, since they capture the semantics and word usage associated with verbs. Unfortunately, most online dictionaries, such as the Longman Dictionary, present verb frames using only broad semantic categories (namely "something" and "somebody"), which are not very informative. In this work, we introduce a method for automatically generating more comprehensive verb frames. The method involves extracting verb argument tuples based on grammatical relations acquired from a parsed corpus, obtaining the intended semantic category of each argument from a knowledge base, estimating the probability of each semantically labeled tuple, and finally generating verb frames. We present a prototype system, FrameFinder, that applies the method to generate verb frames automatically. Evaluation on a set of verbs with manually compiled semantic patterns shows that the method extracts verb frames with high accuracy for the high-frequency verbs that matter most in language learning.
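The pipeline described in the abstract can be illustrated with a small sketch. This is not the FrameFinder implementation. The abstract does not say how the probabilities are estimated, so the sketch assumes a simple fractional-count estimator: each observed tuple spreads its count evenly over its candidate category combinations. The knowledge base, the category labels, the example tuples, and the function names are all hypothetical.

```python
from collections import defaultdict
from itertools import product

# Hypothetical knowledge base: noun -> candidate semantic categories.
# Ambiguous nouns list more than one category.
KNOWLEDGE_BASE = {
    "chef": ["PERSON"],
    "mother": ["PERSON"],
    "student": ["PERSON"],
    "chicken": ["FOOD", "ANIMAL"],
    "rice": ["FOOD"],
    "fish": ["FOOD", "ANIMAL"],
}

# Hypothetical (verb, subject, object) tuples standing in for the
# grammatical relations a parser would extract from a corpus.
ARGUMENT_TUPLES = [
    ("cook", "chef", "chicken"),
    ("cook", "mother", "rice"),
    ("cook", "student", "fish"),
]


def estimate_semantic_tuples(tuples, kb):
    """Estimate P(semantic tuple | verb) with fractional counts.

    Each observed argument tuple spreads a count of 1 evenly over all
    combinations of its arguments' candidate categories (an assumed
    estimator, chosen for simplicity).
    """
    counts = defaultdict(lambda: defaultdict(float))
    for verb, *args in tuples:
        candidates = [kb.get(arg, ["UNKNOWN"]) for arg in args]
        combos = list(product(*candidates))
        for combo in combos:
            counts[verb][combo] += 1.0 / len(combos)

    probs = {}
    for verb, combo_counts in counts.items():
        total = sum(combo_counts.values())
        probs[verb] = {c: n / total for c, n in combo_counts.items()}
    return probs


def best_frames(probs, top_n=1):
    """Render the top_n most probable semantic tuples of each verb as frames."""
    frames = {}
    for verb, dist in probs.items():
        # Sort by descending probability, breaking ties alphabetically.
        ranked = sorted(dist.items(), key=lambda kv: (-kv[1], kv[0]))
        frames[verb] = [f"{subj} {verb} {obj}" for (subj, obj), _ in ranked[:top_n]]
    return frames


if __name__ == "__main__":
    probabilities = estimate_semantic_tuples(ARGUMENT_TUPLES, KNOWLEDGE_BASE)
    print(best_frames(probabilities))
```

In this toy data, the unambiguous object "rice" shifts probability mass toward the FOOD reading, so the ambiguous objects "chicken" and "fish" end up resolved to FOOD and the resulting frame replaces the generic "somebody cook something" with category labels.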


