  • Journal


The Development and Validation of a Conceptual Model for Internet Information Retrieval Agent Based on Conceptual Learning Theory



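Most information on the Internet today is obtained through search engines, and most current search systems handle a request by having the user enter one or more keywords and then searching on that keyword set. The results of this approach are often unsatisfactory, for two main reasons: first, the ambiguity caused by the semantics of the keywords themselves; second, users entering inappropriate or semantically incomplete keywords or keyword sets. To supply users with more suitable keywords and thereby improve search results, this study builds a conceptual model of an Intelligent Agent System that learns the knowledge a user is interested in and uses it to assist searching. It also proposes using neural network concepts to represent a Personal Keyword Association Network and a Global Keyword Association Network: the former stores the associations among keywords related to the domains the user prefers; the latter stores the relationships between each keyword and other keywords across all domains. The keywords entered by the user activate both networks to expand new keywords. This extended information indirectly helps the system determine which sense of a keyword the user intends, raises the suitability of the search results for the user, and helps the user search with more accurate keywords, which corrects the search direction and reduces the time spent on repeated searches. This study experimentally validates the Global/Personal Page Mining module of this intelligent agent architecture. For this module, which extracts the concepts in a document, the study examines the differences among the TF*IDF weighting formulas used to assess keyword importance, examines the differences in human word selection, and finally compares the weighting formulas with human selection. The human-selection experiment shows that users share a consensus about the keywords of each document, and that disagreement among users about a document's keywords decreases as the number of suggested words increases. In the comparison between weighting formulas and human selection, two indicators, Accuracy and Coverage, are used to select the weighting formula closest to human cognition.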
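The keyword expansion step described above can be illustrated with a minimal sketch. It assumes each association network is a plain dictionary mapping a keyword to its weighted neighbors and that a single activation step is enough. The function name `expand_keywords`, the `personal_bias` mixing weight, and the toy networks are all hypothetical and do not come from the study.

```python
from collections import defaultdict


def expand_keywords(query, personal_net, global_net, personal_bias=0.7, top_k=3):
    """Activate both networks from the query keywords and return the
    most strongly associated keywords that are not already in the query.

    personal_bias is an assumed mixing weight, not a value from the study.
    """
    activation = defaultdict(float)
    networks = ((personal_net, personal_bias), (global_net, 1.0 - personal_bias))
    for keyword in query:
        for net, weight in networks:
            # Spread activation one step to every neighbor of the keyword.
            for neighbor, strength in net.get(keyword, {}).items():
                if neighbor not in query:
                    activation[neighbor] += weight * strength
    # Highest activation first; ties are broken alphabetically.
    ranked = sorted(activation.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_k]]


# Toy networks (assumed data): the personal network reflects a user
# interested in software, the global network covers every sense of "java".
personal_net = {"java": {"programming": 0.9, "compiler": 0.6}}
global_net = {"java": {"programming": 0.5, "coffee": 0.8, "island": 0.7}}

suggestions = expand_keywords(["java"], personal_net, global_net)
print(suggestions)
```

With these toy weights the software-related neighbors rank ahead of "coffee" and "island", which is the disambiguating effect the personal network is meant to provide.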


In recent years, a large amount of information has been retrieved from the Internet through search engines. A search engine requires the user to enter one or more keywords to carry out a query, but the results are sometimes unsatisfactory because keywords can have several senses and may be poorly chosen. To expedite the search process, an intelligent agent system is proposed that simulates the way humans process information. Using an associative network to represent the knowledge structure, the model mimics human long-term and short-term memory and also takes human learning processes into account. The conceptual model proposed in this study provides a framework for the further development and implementation of this intelligent information retrieval agent. Two experiments were conducted to verify the Global/Personal Page Mining Module of an Internet information retrieval agent system based on cognitive learning. The purpose of the Global/Personal Page Mining Module is to find the representative words in a web document, and the words it selects are expected to agree with human selection. The results of the first experiment suggest that users hold common opinions about the representative words chosen from each document, and that the more words the subjects were allowed to suggest, the higher the consensus among them. In the second experiment, different forms of the TF*IDF formula were used to generate representative keywords from web documents, and the results were compared with human selection to find the formula that best fits human choice. In this experiment, "accuracy" and "coverage" were used to define "fitness". At the end of the study, the better-fitting formula is recommended for the further development of this information retrieval agent system.
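The comparison in the second experiment can be sketched as follows. The abstract does not state which TF*IDF variants were tested or how accuracy and coverage were computed, so everything here is an assumption: the sketch uses the textbook weight tf × log(N / df), treats accuracy as the share of formula-selected words that the human subjects also chose, and treats coverage as the share of human-chosen words that the formula recovered. The toy documents and function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=3):
    """Rank the terms of one tokenized document by tf * log(N / df).

    This is the textbook weighting, assumed here for illustration only.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    term_freq = Counter(docs[doc_index])
    scores = {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in term_freq.items()}
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


def accuracy(selected, human):
    """Assumed definition: share of selected words that humans also chose."""
    return len(set(selected) & set(human)) / len(selected) if selected else 0.0


def coverage(selected, human):
    """Assumed definition: share of human-chosen words that were selected."""
    return len(set(selected) & set(human)) / len(human) if human else 0.0


# Toy corpus and human choice (assumed data, not from the experiments).
docs = [
    ["agent", "search", "keyword", "agent", "network"],
    ["search", "engine", "keyword", "query"],
    ["network", "learning", "memory", "learning"],
]
human_choice = ["agent", "network", "search"]

selected = tfidf_keywords(docs, 0, top_k=2)
print(selected, accuracy(selected, human_choice), coverage(selected, human_choice))
```

Under these assumed definitions, candidate formulas could be ranked by computing both measures per document and averaging over the collection.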
