  • Thesis


An Interactive Computer-Aided Translation and Writing Assistant

Advisor: 張俊盛


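This thesis proposes a method for suggesting the grammar and translations of the text that continues a partial translation. The aim is to ease the burden of word choice on language learners during translation, reduce their errors in grammar and phrase usage, and thereby improve writing quality, particularly writing productivity. In our approach, the grammar suggestions, which include word usage patterns, and the translation suggestions are generated in real time and integrated into an interactive online writing platform, so the user does not need to open separate lookup pages or rely on other systems. The method involves automatically extracting and composing translation candidates for unknown (out-of-vocabulary) words, automatically analyzing target-language corpora to extract syntax-based word usage patterns, or phraseological tendencies, and automatically extracting bilingual translation pairs to support text prediction for the translation. At run time, the source text and the translation the user has entered so far are split into n-grams to generate grammar, word usage, and translation suggestions for the continuation. These suggestions are evaluated, ranked, and combined on the fly, then sent to the user as hints. We implemented the method as a prototype system, TransAhead, and applied it to computer-assisted translation and computer-assisted writing, and even to computer-assisted language learning. Experimental results show that our unknown-word module provides acceptable translation candidates for unknown words and reduces the negative effects that unknown words have on existing translation systems (problems with surrounding word choice and ordering), while the writing suggestion module (that is, the word usage pattern module) clearly helps language learners in writing, especially in the use of articles and prepositions. The overall evaluation finds that the translation and writing suggestions provided by the TransAhead prototype have considerable potential for translation and writing: on average, users' translation performance, measured with the automatic machine translation metric BLEU, improved significantly.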


We introduce a learning method that predicts text completions and grammatical constructions to assist in translating a source text. In the proposed approach, predictions are offered on the fly during sentence translation to help the user make appropriate lexical and grammatical choices, thereby improving writing quality and productivity. The method has three components: automatically extracting and evaluating sublexical (constituent) translations for out-of-vocabulary (OOV) words (the OOV module, for text prediction); automatically analyzing target-language sentences to generate general and syntax-based phraseological tendencies (the target-language writing suggestion module, for grammar prediction); and automatically learning high-confidence word- or phrase-level translation equivalents (for text prediction). At run time, the source text and the translation prefix entered by the user are broken down into n-grams to generate grammar and translation predictions, which are then combined and ranked using translation and language models. The ranked prediction candidates are displayed to the user in a pop-up menu as translation or writing hints. We present a prototype writing assistant, TransAhead, that applies the method in a human-computer collaborative environment for computer-assisted translation and computer-assisted language learning. Experimental results show that the OOV module provides good translations for unknown words and eases the impact of OOV words on translation quality. Language learners also benefit substantially from the writing module's phraseology information. Overall, our methodology supports inline text and grammar predictions and has great potential for assisting language learners and novice translators in translation, writing, and even language learning.
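The run-time step described above (split the source text and the translation prefix into n-grams, look up candidates, then combine and rank them) can be illustrated with a minimal Python sketch. This is not the TransAhead implementation: the tables `PHRASE_TABLE` and `PATTERNS`, the function names, the Chinese-to-English example, the multiplicative scoring, and the `floor` smoothing value are all assumptions made for illustration only.

```python
from collections import defaultdict


def ngrams(tokens, max_n=3):
    """Return all contiguous n-grams (as tuples) of length 1..max_n."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def suffix_ngrams(tokens, max_n=3):
    """Return the n-grams that end at the last token (the user's cursor)."""
    return [tuple(tokens[-n:]) for n in range(1, min(max_n, len(tokens)) + 1)]


# Hypothetical toy translation table: source n-gram -> {target phrase: score}.
PHRASE_TABLE = {
    ("我們",): {"we": 0.9},
    ("扮演",): {"play": 0.8, "act": 0.2},
    ("重要", "角色"): {"an important role": 0.7, "a key part": 0.3},
}

# Hypothetical toy pattern table: prefix n-gram -> {continuation: score}.
PATTERNS = {
    ("play",): {"an important role": 0.6, "a key part": 0.1, "with": 0.3},
}


def suggest(source_tokens, prefix_tokens, top_k=3, floor=0.01):
    """Rank continuation candidates by translation score times pattern score."""
    typed = set(prefix_tokens)

    # Translation candidates come from every n-gram of the source text.
    tm = defaultdict(float)
    for gram in ngrams(source_tokens):
        for target, score in PHRASE_TABLE.get(gram, {}).items():
            if set(target.split()) <= typed:
                continue  # crude check: the user has already typed this
            tm[target] = max(tm[target], score)

    # Pattern candidates come from the n-grams ending at the cursor.
    lm = defaultdict(float)
    for gram in suffix_ngrams(prefix_tokens):
        for continuation, score in PATTERNS.get(gram, {}).items():
            lm[continuation] = max(lm[continuation], score)

    # Combine both evidence sources; `floor` stands in for a missing score.
    candidates = set(tm) | set(lm)
    scored = [(c, tm.get(c, floor) * lm.get(c, floor)) for c in candidates]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


if __name__ == "__main__":
    for phrase, score in suggest(["我們", "扮演", "重要", "角色"], ["we", "play"]):
        print(f"{phrase}\t{score:.3f}")
```

In this toy setting, a candidate supported by both the source text and the target-language pattern ranks above one supported by only a single source of evidence. A real system would replace the two dictionaries with learned translation and language models.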


