
The Design and Deployment of a Vertical Literature Gathering System: An Experimental Platform Integrating Agent and Search Engine Technologies

Original title: 深度文獻蒐集系統之設計與實驗-結合代理人及搜尋引擎技術實作之平台

Advisor: 陳宗天

Abstract


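In recent years, systems that automatically collect and integrate massive amounts of information have matured. However, these systems (Google Scholar, for example) offer limited mechanisms for interacting with users and do not provide an open, extensible platform for research. In this study we develop Meme, a system that combines agents with a search engine, as a preliminary solution to these issues. The system is designed to be extensible and flexible, and it automatically collects and analyzes research-related information. Our experiments address two questions: which tasks does the system help users and developers complete, and how does the system integrate information from different websites? Meme's flexible design and extensible architecture also make it easy for developers to maintain the system and add features.
Meme applies both the JADE agent platform and the Nutch search engine. Nutch crawls the target content and converts it, while JADE agents coordinate the system's components, control the search engine, and analyze the collected information.
The results are presented as extraction rates for online literature information and a case analysis of visualized queries. The extraction rate for seed articles on the Web of Science website exceeded 89%, which supports the system's effectiveness. The system also offers users two functions: export and visualized query. With export, a user enters a keyword and all articles related to it are exported as a .net file; the user can then analyze the literature network in the Pajek network analysis software or explore the article network with a prefuse applet. With visualized query, a user runs a keyword query to see the citation network of the related literature. Because the volume of literature has grown rapidly in recent years, scholars want to examine the relationships among publications, and visualized search is one way to make those relationships easier to understand.
The platform collects literature information from different sources and converts it into a format that is easy to process automatically, for analysis or further processing. To gather literature, perform the related analyses, and meet user needs, the system must assign tasks and also set the seed links and environment variables that let the search engine extract information from target websites. In summary, the goal of this research is to build an extensible system that integrates information and interacts with users, helping them process and integrate large volumes of literature-related information.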

Parallel Abstract


In an age of information explosion, systems capable of automatically collecting and integrating voluminous information have become indispensable. Several systems have tried to address this need. However, these systems, such as Google Scholar, lack the desired functionality and do not provide an extensible platform for researchers. In this research we propose a system that addresses these needs by combining agent and search engine technologies. Meme, our new application of agent models, is our solution to these issues. This research carried out two experiments to address two questions: to what extent does the Meme system benefit users and developers, and how does it gather information from various sources? The design of Meme provides an extensible and flexible platform for collecting and analyzing research-related information. Meme was implemented with both JADE and Nutch. The Nutch search engine component crawls the intended content of a web site, while the JADE agent platform controls the Nutch search engine, communicates with other agents, and analyzes the collected information. In this paper, we present the literature information extracted from the Web and visualize the search results using our Web application. The extraction rate for seed articles from Web of Science is over 89%. The experimental results tentatively support the effectiveness of the system. Furthermore, Meme lets users export and visualize search results. The export function generates a .net file whose content is retrieved based on the keywords entered by the user. A user can load the .net file into Pajek for large-scale literature network analysis. In visualized search, users can see the citation network.
This platform aggregates literature information gathered from different sources into a viable format for further processing. Meme integrates the JADE agent platform and the Nutch search engine to achieve its goals. The agent assigns tasks to the search engine and sets up the seed links that the search engine uses to crawl. In summary, this study builds an extensible system that integrates information from the Web and presents it interactively to users.
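The export step described in the abstract can be illustrated with a minimal sketch that writes a citation network in Pajek's .net format. It is not Meme's actual exporter. The function name `to_pajek_net` and the input shape (pairs of citing and cited titles) are assumptions made for illustration only.

```python
from typing import Iterable


def to_pajek_net(citations: Iterable[tuple[str, str]]) -> str:
    """Serialize (citing, cited) title pairs as a Pajek .net directed network.

    Hypothetical helper: the input shape is assumed, not taken from Meme.
    """
    ids: dict[str, int] = {}
    arcs: list[tuple[int, int]] = []
    for citing, cited in citations:
        for title in (citing, cited):
            # Pajek vertex numbers start at 1, assigned in order of first appearance.
            ids.setdefault(title, len(ids) + 1)
        arcs.append((ids[citing], ids[cited]))

    lines = [f"*Vertices {len(ids)}"]
    for title, number in ids.items():
        label = title.replace('"', "'")  # keep the quoted label well-formed
        lines.append(f'{number} "{label}"')
    lines.append("*Arcs")  # directed edges: citing article -> cited article
    lines.extend(f"{src} {dst}" for src, dst in arcs)
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file with a .net extension should give a network that Pajek can open, with one vertex per article and one arc per citation.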


Cited By


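林晏棟 (2009). 研究智慧分析工具整合平台設計與實作 [Design and implementation of an integrated platform for research intelligence analysis tools] [Master's thesis, National Taipei University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0023-1408200922522800
楊承豪 (2010). 研究智慧平台整合與應用 [Integration and application of a research intelligence platform] [Master's thesis, National Taipei University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0023-1009201014284900
張志峰 (2010). 高彈性引文分析資訊系統架構 [A highly flexible information system architecture for citation analysis] [Master's thesis, National Taipei University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0023-0308201000561100
高浩修 (2011). 引文分析系統的實證研究 [An empirical study of a citation analysis system] [Master's thesis, National Taipei University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0023-3108201115132900
黃贇睿 (2012). 文獻回顧系統平台效能提昇與使用績效實證研究 [Performance improvement of a literature review system platform and an empirical study of its usage performance] [Master's thesis, National Taipei University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0023-2707201215400200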
