Statistical Translation Model for Phrases (統計式片語翻譯模型)

Abstract


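Machine translation is one of the most important topics in natural language processing research. In the past, the more successful applications of machine translation were mostly translations of documents in specific domains. More recently, with the spread of the Internet and search engines, attention has turned to the role of machine translation in Cross Language Information Retrieval. In cross-language retrieval, translation is usually applied to the query terms or phrases (Query Translation). For retrieval to be effective, however, the translation must be highly relevant to the document collection being searched. Current approaches to translating query keywords, whether they rely on off-the-shelf translation software or on a general-purpose bilingual dictionary, can hardly guarantee translations that are relevant to the documents. We therefore aim to translate query keywords with a Statistical Phrase Translation Model (SPTM) in order to improve the effectiveness of cross-language retrieval. In this paper, we propose a new statistical phrase translation model and evaluate it experimentally. In the experiments, we use the BDC bilingual electronic dictionary to test SPTM on word alignment within phrases. Alignment analysis produced with SPTM is comparatively fast and more accurate.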

Keywords

None listed

Parallel Abstract


Machine Translation (MT) is one of the most difficult problems in the field of natural language processing. In the past, MT has been applied to professional communication, in particular the translation of technical and corporate documents in a specific domain. Recent years have seen the rapid development of the Internet as a new form of communication and information exchange, and the need to access information across the language barrier has become apparent. People have begun to look into the role that MT can play in Cross Language Information Retrieval (CLIR). The prevalent approach to CLIR is based on translation of the query, in particular query phrases. However, CLIR has the additional objective of translating into something that is relevant to the collection being searched. Therefore, the current approach of using a general bilingual word list or off-the-shelf commercial MT software is bound to be very ineffective at retrieving relevant documents. We propose a new approach, the Statistical Phrase Translation Model (SPTM), aimed at achieving a tighter estimation of phrase translation. Experiments were conducted using bilingual phrases in the BDC Electronic Chinese-English Dictionary. Preliminary results show that the approach is much faster and produces better word alignment for phrases, which has not been possible using previous approaches.

References


Behavior Design Corporation (BDC). (1992). Behavior Design Corporation.
Jyun-Sheng J. S., J. S. (2001). Proceedings of the Second NTCIR Workshop Meeting on Evaluation of Chinese and Japanese Text Retrieval and Text Summarization (5).
Chen, A., Gey, F. C. (1997). Proceedings of the 6th Text Retrieval Evaluation Conference.
Jyun-Sheng J. S., J. S. (1998). Proceedings of the Third Conference of the Association for Machine Translation in the Americas (AMTA).
Cocke, J., Roosin, P. S., Brown, P. F., Della Pietra, S. A., Della Pietra, V. J., Jelinek, F., Mercer, R. L. (1988). Proceedings of the 12th International Conference on Computational Linguistics.

Cited By


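洪大弘 (2009). 基於語言模型及正反面語料知識庫之中文錯別字自動偵錯系統 [An automatic Chinese typo detection system based on language models and knowledge bases of positive and negative corpora] [Master's thesis, Chaoyang University of Technology]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0078-0801201511153723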
