
Application of a Parallel Search Engine to Protein-Protein Interaction Literature (平行搜尋引擎於蛋白質交互作用文獻之應用)

The Design of a Parallel Search Engine for Protein-Protein Interaction Literature

Advisor: 李錫捷

Abstract


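Owing to rapid advances in technology, within a mere decade humanity has progressed from decoding a few dozen genes to the gradual completion of the entire genome map, moving step by step into the post-genomic era. In this new era, the questions being asked are what the information in DNA means and how proteins interact with one another. Keeping up with the relevant literature and extracting the needed information from it has therefore become an important goal. This study applies information retrieval techniques to the domain of protein-protein interactions in order to extract appropriate information from the vast biomedical literature databases.

The study compiles statistics on the discriminating words for protein-protein interaction proposed by earlier researchers and sets a threshold that decides whether the topic of a paper is protein-protein interaction. Building on this, it searches for related literature with a "keypage"-based agent search engine. Articles are processed using the characteristic patterns of protein-name notation collected in the study, and key information is extracted from them. This is combined with a protein-name database and protein-name abbreviations to broaden the retrieval scope, and fuzzy theory is used to compare the relevance of articles.

Besides realizing the research framework in a prototype system, which is paired with a web interface for users to operate and consult, the study also proposes a distributed architecture so that the agent search engine processes retrieval data without delay and retrieval is faster.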
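The threshold test on discriminating words described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the word list, the weights, the prefix-matching rule, and the value of `THRESHOLD` are all assumptions, since the abstract does not give them.

```python
import re

# Hypothetical discriminating words and weights. The thesis derives its own
# list from statistics over prior work; the abstract does not reproduce it.
DISCRIMINATING_WORDS = {
    "interact": 2.0,
    "bind": 1.5,
    "complex": 1.0,
    "associate": 1.0,
}
THRESHOLD = 3.0  # assumed value, for illustration only


def interaction_score(text: str) -> float:
    """Sum the weights of tokens that start with a discriminating stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = 0.0
    for token in tokens:
        for stem, weight in DISCRIMINATING_WORDS.items():
            if token.startswith(stem):
                score += weight
                break  # count each token at most once
    return score


def is_ppi_document(text: str, threshold: float = THRESHOLD) -> bool:
    """Classify a text as protein-protein interaction literature."""
    return interaction_score(text) >= threshold
```

With these assumed weights, a sentence such as "Protein A interacts with protein B and binds the complex." scores above the threshold, while text with no discriminating words scores zero.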

Parallel Abstract


With the sequencing of the human genome recently accomplished, the next challenge is to determine what to do with this overwhelming body of information. One important task in the post-genomic era is to identify protein-protein interactions from the tremendous amount of literature available. This study proposes a parallel search system for finding protein-protein interaction literature in databases on the Internet. The system identifies discriminating words for protein-protein interaction from statistics and from results reported in the literature, and evaluates a threshold to decide whether a given paper is related to protein-protein interactions. A keypage-based search mechanism then finds papers related to a given one. To expand the search space and improve performance, the system uses both protein-name identification mechanisms and protein-name databases. The system is designed with a web-based user interface and a parallel job-dispatching kernel. Experiments were conducted and the results were checked by a biomedical expert. The experimental results indicate that the proposed system helps researchers find protein-protein interaction literature among the overwhelming amount of information in biomedical databases on the Internet.
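As a rough illustration of what a parallel job-dispatching kernel does, the sketch below fans per-document jobs out to a pool of workers and collects the results. It uses Python's standard `concurrent.futures` thread pool as a stand-in; the abstract does not describe the actual dispatching design, and `score_document`, `dispatch`, and the keyword-counting job are hypothetical names and logic chosen only for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Tuple


def score_document(doc_id: str, text: str) -> Tuple[str, int]:
    """Stand-in job: count occurrences of an assumed keyword stem."""
    return doc_id, text.lower().count("interact")


def dispatch(documents: Dict[str, str], workers: int = 4) -> Dict[str, int]:
    """Submit one job per document to a worker pool and gather the scores."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(score_document, doc_id, text)
            for doc_id, text in documents.items()
        ]
        # result() blocks until each job finishes, so every score is collected.
        return dict(future.result() for future in futures)
```

In a real system each job would more plausibly fetch and analyze a page from a literature database, which is the kind of I/O-bound work where a worker pool hides latency.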

