
中文資訊檢索之詞彙資源效益

Effectiveness of Vocabulary Resources in Chinese Information Retrieval

Advisor: 陳光華

Abstract


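This study uses the standard document collection and topic set provided by the fifth NTCIR workshop as its test environment, and runs retrieval experiments on the topics whose descriptions are written entirely in Chinese. Participants were given vocabulary resources built from three different concepts and techniques: a traditional thesaurus, a statistical thesaurus, and an ontology. After relevant-term feedback processing, participants selected query expansion terms from these resources. Measured by the number of topics whose retrieval effectiveness improved, the statistical thesaurus improved slightly more topics than the ontology; measured by the magnitude of the improvement, however, the ontology performed best. In this experiment, the traditional thesaurus performed worst at improving retrieval effectiveness.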

Parallel Abstract


In this study, the effectiveness of different kinds of vocabulary resources for Chinese information retrieval is examined and compared based on interactions between users and the information retrieval system. We use a traditional thesaurus, a statistical thesaurus, and an ontology to carry out a series of experiments. The NTCIR5 test collection, which is composed of a topic set, a document set, and an answer set, is used as the benchmark. To keep the study focused, the 25 queries written entirely in Chinese are extracted from the 50 queries in the NTCIR5 topic set and examined. The experimental results show that the statistical thesaurus greatly increases the number of improved queries, but the ontology greatly increases retrieval performance. The traditional thesaurus shows the poorest performance among these vocabulary resources. We also find that users with good experience in information retrieval make good use of the vocabulary resources and produce good retrieval results. In addition, all vocabulary resources help Type-II queries, i.e., queries with simple concepts and non-specific temporal and spatial scope.


Cited By


王美瑤 (2009). 台灣藥用植物及中藥材資訊系統之研究 [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2009.02098
