Bidirectional Word Alignment and Collocation Models for Statistical Machine Translation

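This paper proposes a bidirectional statistical machine translation model (bidirectional SMT model) built on bidirectional alignment and collocation models, which makes multi-word correspondences possible in both translation directions. The model lets the alignment vector point to words in a sentence of either a different language or the same language, and this is how it distinguishes an alignment from a collocation. For phrase alignment, we also take unaligned words into account and judge whether these unaligned words should exist. To secure the best phrase pairs, we need only consider the alignment pairings and collocation pairings among the words on the source side and the target side. We manually annotated 60 translated sentence pairs, which yielded 1,542 word pairs, and used them to compare the accuracy obtained under each parameter setting. The accuracy reaches up to 48%.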
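The distinction between alignment and collocation links described above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the paper's implementation: the function name `classify_links`, the representation of the alignment vector as a list of indices into the concatenated source and target sentences, and the example sentence pair are all hypothetical. No probabilities or EM training are modeled here.

```python
from typing import List, Optional, Tuple

Link = Tuple[str, Optional[str], str]


def classify_links(src: List[str], tgt: List[str],
                   pointers: List[Optional[int]]) -> List[Link]:
    """Label each word's link as alignment, collocation, or unaligned.

    Assumption (not from the paper): pointers[i] is an index into the
    concatenated list src + tgt, or None when word i points nowhere.
    """
    combined = src + tgt
    if len(pointers) != len(combined):
        raise ValueError("need exactly one pointer per word")
    n_src = len(src)
    links: List[Link] = []
    for i, j in enumerate(pointers):
        word = combined[i]
        if j is None:
            # The word has no link in either sentence.
            links.append((word, None, "unaligned"))
            continue
        # Same sentence (same language) means collocation;
        # the other sentence (other language) means alignment.
        same_side = (i < n_src) == (j < n_src)
        kind = "collocation" if same_side else "alignment"
        links.append((word, combined[j], kind))
    return links


# Hypothetical sentence pair, chosen only for illustration.
src = ["take", "medicine"]
tgt = ["吃", "藥"]
links = classify_links(src, tgt, [1, 3, 0, None])
for link in links:
    print(link)
```

In this toy input, "take" points to "medicine" in the same sentence (a collocation), "medicine" and "吃" point across languages (alignments), and "藥" is left unaligned.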

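Keywords

Word Alignment; Collocation; Bidirectional Alignment-Collocation Models; Bidirectional Statistical Machine Translation; EM Algorithm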

Parallel Abstract

In this paper, we propose a bidirectional statistical machine translation model based on bidirectional alignment and collocation models, which makes it possible to align multiple words in both translation directions. The model allows the alignment vector to point to words in a sentence of the same or a different language, and in this way distinguishes between alignment and collocation. For phrase alignment, we also consider unaligned words and determine whether these unaligned words should exist. We only need to consider the alignment and collocation relationships among source-side and target-side words to ensure the best phrase pairs. We manually annotated 60 translated sentence pairs to compare the accuracy of each set of meta-parameter settings. The annotation produced a total of 1,542 word pairs, and the accuracy reaches up to 48%.

Parallel Keywords

Word Alignment; Collocation; Bidirectional Alignment-Collocation Models; Statistical Machine Translation; EM Algorithm

References

[Brown 1990] Brown, Peter F., J. Cocke, Stephen A. Della Pietra, Vincent J. Della Pietra, Frederick Jelinek, John D. Lafferty, Robert L. Mercer, and Paul S. Roossin. 1990. “A statistical approach to machine translation.” Computational Linguistics, 16(2):79–85.

[Brown 1993] Brown, Peter F., Stephen A. Della Pietra, Vincent J. Della Pietra, and R. L. Mercer. 1993. “The mathematics of statistical machine translation: Parameter estimation.” Computational Linguistics, 19(2):263–311.

[Callison-Burch 2005] Chris Callison-Burch and Philipp Koehn, “Introduction to Statistical Machine Translation,” Tutorial slides, ESSLLI Summer Course on SMT, ESSLLI 2005.

[Chiang 2005] Chiang, David, 2005. “A Hierarchical Phrase-Based Model for Statistical Machine Translation,” Proceedings of the 43rd Annual Meeting of the ACL, pages 263–270, Ann Arbor, June 2005.

[Liang 2006] Percy Liang, Ben Taskar, and Dan Klein, “Alignment by Agreement,” Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, pages 104–111, New York, June 2006.
