
文件資料之概念主題檢索

On Study of Document Retrieval Based on Concept Indexing

Advisor: 姚修慎

Abstract


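The purpose of this thesis is to improve the retrieval efficiency of full-text documents, saving information seekers the time and cost wasted on repeatedly screening and filtering information. With the spread of the Internet, the transmission of information has grown explosively. Information from many fields of knowledge is exchanged widely over the network, and information around the world is continually received and transmitted as text, images, and sound. Of these, text accounts for the largest share. Text records the conceptual details that people describe in natural language. Given such unstructured textual descriptions, how can retrieval extract the information an individual needs? The problem is as hard as finding the information one needs in a data stream as vast as the ocean. To retrieve unstructured full-text data effectively, we propose a methodology that differs from traditional keyword retrieval, which we call concept retrieval.

The core of concept retrieval is concept classification. To support it, we compute document similarity in a multidimensional vector space. Document concept similarity is then no longer just a matter of counting keyword frequencies: the relatedness, expansion, and contraction of concepts can also be obtained by computing distances between vectors in that space. The system design built on concept classification adds a concept subject retrieval function. Through a visual radar-chart interface, searchers are guided to adjust concepts visually, which helps them optimize their retrieval requests. The aim is to meet any searcher's needs on the first query and to provide information that is useful to the searcher.

The system was implemented using electronic newsletters on the Internet as the experimental subject. This study emphasizes the methodology itself, however, and any full-text document data can use it for concept-based full-text retrieval. The results confirm that concept retrieval can deliver appropriate recall and precision in its results according to the differing needs of information seekers. In addition, the concept subject retrieval function lets the seeker's retrieval concepts align with the concepts offered by the information system, which strengthens retrieval effectiveness.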

Parallel Abstract


This study develops a procedure to improve query efficiency for text data, saving time and cost for those who need information. With the widespread use of the Internet, information is no longer limited by geography. It propagates widely over the Internet in the form of voice, pictures, and text. Compared with the other formats, text is the dominant medium of communication in human society. However, concepts described in text lack precision compared with a traditional database, which uses the "tuple" to record data precisely. To make querying text data efficient, this study proposes a methodology named concept indexing, in contrast to traditional text indexing, which usually requires time to re-screen the information during a query. Concept categories are the core of concept indexing. All keywords are transferred from the term space to the concept space, and document similarity is then calculated in the concept space using Euclidean distance in the vector space. Working in a vector space makes it possible to express relation, contraction, and dilation between concepts. Based on the concept categories, the study also incorporates the idea of visual text mining to address the subject within a concept, helping the information seeker obtain useful results on the first query. Internet news was used to implement the system. Because what is proposed is a methodology, other kinds of text data sources can be adapted to it. The experimental results show that concept indexing can adjust recall and precision according to the requirements of the information seeker, and that the concept subjects of the information seeker and the query system can be matched by using concept indexing.
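
The abstract's central step, moving keywords from the term space to the concept space and comparing documents by Euclidean distance, can be illustrated with a small sketch. This is not the thesis's implementation: the term-to-concept table, the concept names, the normalization by matched-term count, and the helper names `concept_vector` and `rank` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical term-to-concept table (an assumption for illustration;
# the abstract does not list the actual concept categories).
TERM_TO_CONCEPT = {
    "stock": "finance",
    "bank": "finance",
    "market": "finance",
    "match": "sports",
    "team": "sports",
    "goal": "sports",
}
CONCEPTS = sorted(set(TERM_TO_CONCEPT.values()))  # fixed axis order


def concept_vector(text):
    """Map a text from term space to a normalized vector in concept space."""
    counts = Counter(
        TERM_TO_CONCEPT[term]
        for term in text.lower().split()
        if term in TERM_TO_CONCEPT
    )
    total = sum(counts.values())
    if total == 0:
        # No known term: place the text at the origin of concept space.
        return [0.0] * len(CONCEPTS)
    return [counts[c] / total for c in CONCEPTS]


def rank(query, documents):
    """Return document names ordered by Euclidean distance to the query."""
    q = concept_vector(query)
    scored = [
        (math.dist(q, concept_vector(body)), name)
        for name, body in documents.items()
    ]
    return [name for _, name in sorted(scored)]


docs = {
    "d1": "bank stock market rally",
    "d2": "team scores goal in match",
    "d3": "bank sponsors team",
}
print(rank("stock market", docs))
```

Because distance is measured between concept vectors rather than keyword counts, a document can rank close to a query without sharing any of its exact terms, as long as its terms fall under the same concepts.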


Cited By


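鄭儒鴻 (2006). A Book Recommendation System Based on RFID Technology (以RFID技術為基礎之書籍推薦系統) [Master's thesis, National Tsing Hua University]. Airiti Library. https://doi.org/10.6843/NTHU.2006.00018
詹岳縉 (2009). Integration and Presentation of Web Multi-Document Summaries (網路多文件摘要整合及呈現) [Master's thesis, Chang Jung Christian University]. Airiti Library. https://doi.org/10.6833/CJCU.2009.00018
何新興 (2009). On the Legal Institutionalization of Government Information Retrieval Systems from the Perspective of the Right to Know (從知的權利論政府資訊檢索系統之法制化) [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2009.10361
邱鴻昌 (2007). A Study on Building a Prototype Knowledge Management System for the Toy and Children-Related Industries Using Association Rules and Clustering Techniques (應用關聯式規則與分群技術建構玩具及兒童相關產業知識管理雛型系統之研究) [Master's thesis, National Taipei University of Technology]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0006-2307200720155800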
