
利用網路搜尋搭配詞翻譯

Mining Bilingual Collocations on the Web

Advisor: 張俊盛

Abstract


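This thesis describes a query-expansion-based method that automatically learns to retrieve translations of collocations from Web data through effective query expansion. For a given Chinese collocation, the method automatically learns query expansion terms, which are used to search for translations through a search engine. In the training phase, we use a parallel corpus to obtain corpus translations and to train the query expansion terms, and we use Web resources to validate their effectiveness. At run time, the input Chinese collocation is automatically converted into a set of query strings and sent to a search engine. Candidate translations are then extracted, filtered and ranked by similarity, and the likely translations are presented. Beyond the translations contained in the parallel corpus, the method can find additional reference translations on the Web. Experimental results show that it is helpful to second-language learners, translators, and machine translation systems alike.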

Parallel Abstract


In this paper, we introduce a new method for learning to find translation equivalents of a given collocation on the Web, based on a query expansion strategy. Our approach involves finding translations in a parallel corpus and learning query expansion (QE) terms for the given collocation, in order to bias search engines towards returning top-ranked snippets that contain the sought-after translations. We use the corpus translations from the parallel corpus and attempt to learn additional QE terms for retrieving more translations on the Web. The query expansion method is trained on a parallel corpus and validated on the Web. At run time, a given collocation is automatically transformed into a set of queries and sent to a search engine. Candidate translations are then retrieved from the returned snippets and ranked according to their similarity to the corpus translations. Our method provides significantly more translation equivalents from the Web in addition to the translations found in the parallel corpus, which could be used to assist language learners, translators, and the development of machine translation systems.
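The run-time steps described in the abstract (form queries, collect candidates from snippets, rank by similarity to the corpus translations) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation. The function names, the n-gram candidate extraction, the Dice word-overlap similarity, the 0.5 threshold, and the sample collocation and snippets are all hypothetical. The abstract does not specify the similarity measure or the query format, and no real search engine is called.

```python
import re
from collections import Counter


def build_queries(collocation, expansion_terms):
    """Pair the quoted source collocation with each expansion term.

    Assumption: one query per learned term; the real query format is not
    given in the abstract.
    """
    return [f'"{collocation}" {term}' for term in expansion_terms]


def extract_candidates(snippets, max_len=3):
    """Count English word n-grams (1..max_len words) seen in the snippets."""
    counts = Counter()
    for snippet in snippets:
        words = re.findall(r"[a-z]+", snippet.lower())
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return counts


def dice(a, b):
    """Dice coefficient over the word sets of two phrases (assumed measure)."""
    sa, sb = set(a.split()), set(b.split())
    return 2 * len(sa & sb) / (len(sa) + len(sb))


def rank_candidates(counts, corpus_translations, threshold=0.5):
    """Keep candidates similar to any corpus translation, best first."""
    scored = []
    for cand, freq in counts.items():
        score = max(dice(cand, t) for t in corpus_translations)
        if score >= threshold:
            scored.append((cand, score, freq))
    # Sort by similarity, then snippet frequency, then alphabetically.
    scored.sort(key=lambda item: (-item[1], -item[2], item[0]))
    return scored


# Hypothetical data standing in for search-engine snippets.
queries = build_queries("吃藥", ["take", "medicine"])
snippets = [
    "吃藥 take medicine after meals",
    "remember to take the medicine 吃藥",
]
ranked = rank_candidates(extract_candidates(snippets), ["take medicine"])
```

In this sketch the corpus translations act as seeds: any snippet phrase that overlaps them enough survives the filter, which is one simple way a Web search could surface variants that the parallel corpus does not contain.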

