Learning to Extract Bilingual Grammar Patterns

Advisor: 張俊盛


We introduce a method for automatically identifying English grammar patterns using a sequence labeling model and for extracting bilingual Synchronous Grammar Patterns (SGPs) to assist language learning. In our approach, English sentences are transformed into a set of words marked with grammar pattern labels, which serve as training data for the sequence labeling model. The method involves training a model to automatically identify English grammar patterns, generating annotated SGP data, creating a phrase table, and developing a method for extracting SGPs using the annotated data and the phrase table. At run-time, the system takes a word queried by the user and displays the corresponding English-Chinese synchronous grammar patterns, ranked by frequency, together with related example sentences. We present a prototype website, FamiliarPatterns, which applies the method to help learners follow correct word usage. Blind evaluation on a set of randomly sampled sentence pairs shows that the method performs reasonably well.
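The extraction step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the thesis's actual implementation: the BIO-style label scheme, the slot labels `V` and `n`, the toy translation table, the pre-segmented Chinese sentence, and the function names `to_chunks`, `english_pattern`, `chinese_pattern`, and `rank_patterns` are all hypothetical. The labels are written by hand here, whereas the thesis obtains them from a trained sequence labeling model.

```python
from collections import Counter

# Assumed slot labels; any other label is treated as a lexical item (e.g. a preposition).
SLOT_LABELS = {"V", "n"}


def to_chunks(tokens, labels):
    """Group BIO-labelled tokens into (pattern_label, [tokens]) chunks."""
    chunks = []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            chunks.append((label[2:], [token]))
        elif label.startswith("I-") and chunks:
            chunks[-1][1].append(token)
        # Tokens labelled "O" lie outside the pattern and are skipped.
    return chunks


def english_pattern(chunks):
    """Join the chunk labels in English word order."""
    return " ".join(label for label, _ in chunks)


def chinese_pattern(chunks, zh_tokens, table):
    """Project each English chunk onto the Chinese sentence via a translation table."""
    aligned, used = [], set()
    for label, words in chunks:
        # Collect every Chinese translation listed for any word in the chunk.
        candidates = set()
        for word in words:
            candidates |= table.get(word.lower(), set())
        # Align the chunk to the first unused Chinese token that is a candidate.
        for i, zh in enumerate(zh_tokens):
            if i not in used and zh in candidates:
                used.add(i)
                aligned.append((i, label if label in SLOT_LABELS else zh))
                break
    # Reorder the aligned elements by their position in the Chinese sentence.
    return " ".join(element for _, element in sorted(aligned))


def rank_patterns(pattern_pairs):
    """Rank (English pattern, Chinese pattern) pairs by frequency, most frequent first."""
    return Counter(pattern_pairs).most_common()


# Hand-labelled toy example (hypothetical data).
tokens = ["We", "discussed", "the", "issue", "with", "him"]
labels = ["O", "B-V", "B-n", "I-n", "B-with", "B-n"]
zh_tokens = ["我們", "和", "他", "討論", "這個", "問題"]
table = {"discussed": {"討論"}, "issue": {"問題"}, "with": {"和"}, "him": {"他"}}

chunks = to_chunks(tokens, labels)
pair = (english_pattern(chunks), chinese_pattern(chunks, zh_tokens, table))
ranked = rank_patterns([pair, pair, ("V n", "V n")])
print(pair)
print(ranked)
```

In this sketch the English side yields the pattern of the labelled chunks, and the Chinese side reuses the same slot labels in Chinese word order, keeping lexical items such as the preposition in Chinese. A real system would need a proper word alignment rather than first-match lookup, which is used here only to keep the example short.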


