  • Thesis

A Multilingual Information Retrieval Approach Based on Growing Hierarchical Self-Organizing Maps

Advisor: 吳美宜
Co-advisor: 楊新章 (Hsin-Chang Yang)

Abstract


As the volume of multilingual text on the Internet grows, multilingual information retrieval has become an important research issue. This thesis describes a method we developed for discovering knowledge in multilingual documents. We collected Chinese and English news articles from Taiwan Panorama magazine (光華雜誌); the test corpus consists of 976 pairs of Chinese-English parallel documents. We adopt a neural-network text clustering approach, the growing hierarchical self-organizing map (GHSOM), to help discover relationships among multilingual documents, and we conducted experiments on the Chinese-English parallel corpus to uncover those relationships. The experimental results show that our multilingual text mining approach can capture conceptual relationships among documents written in different languages.

