
Automated Information Extraction: Extracting Protein-Protein Interactions from Biological Literature

Automated Extraction of Protein-Protein Interactions from Biological Literature

Advisor: 留忠賢

Abstract


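Motivation: To make more efficient use of the literature held in today's life science digital databases, we implemented an automated software system that determines whether a document describes a protein-protein interaction, in support of biological research on heredity and evolution. Using information extraction techniques, the system automatically extracts information about interactions or influences between proteins (protein-protein interactions) from digital databases and presents the resulting interaction network to users graphically. Protein interaction information discovered by current scientific research is stored in digital databases mainly in the form of scientific journal articles. These data are not stored in a format that computers can process easily, so searching ordinary biological literature by computer for protein-protein interactions is inefficient. To help researchers obtain protein-protein interaction information from these databases by computer, we therefore use natural language processing techniques to implement an information extraction system that assists in analyzing the content of the literature. Tested on abstracts of yeast-related protein papers available in BIND (Biomolecular Interaction Network Database), our program achieves a precision of 82%.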

Parallel Abstract


Motivation: To improve the efficiency of using life science literature stored in digital databases, we designed software that determines the interactions between proteins described in the abstracts of life science papers. The program is intended to help researchers in genetics and other life sciences. It uses information extraction techniques to determine protein interaction information automatically and presents the protein interaction network to users graphically. Currently, most protein interaction information is described in life science journals or genomics conference proceedings. However, these data are stored in a format that is convenient for human reading but hard for programmers to build supporting software around, and difficult to query precisely. We therefore use natural language processing techniques to implement software that determines protein interactions. This information extraction system helps analyze life science literature and obtain protein interaction information automatically. Tested on abstracts of yeast papers stored in the Biomolecular Interaction Network Database (BIND), our program achieves a precision of about 82%.
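
To make the reported precision figure concrete, here is a minimal sketch of how precision could be computed for extracted protein pairs. It is not the system described in the abstract: the protein lexicon, the verb list, the co-occurrence rule, and the example sentences are all hypothetical and chosen only for illustration. Precision here means the fraction of extracted pairs that also appear in a curated reference set.

```python
import re
from itertools import combinations

# Hypothetical protein lexicon and interaction verbs (illustration only).
PROTEINS = {"Cdc28", "Cln2", "Sic1", "Swi6"}
INTERACTION_VERBS = re.compile(
    r"\b(binds?|interacts?|phosphorylates?|activates?|inhibits?)\b",
    re.IGNORECASE,
)

def extract_pairs(sentence):
    """Return unordered protein pairs that co-occur with an interaction verb."""
    if not INTERACTION_VERBS.search(sentence):
        return set()
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    found = sorted({tok for tok in tokens if tok in PROTEINS})
    return {frozenset(pair) for pair in combinations(found, 2)}

def precision(predicted, reference):
    """Fraction of predicted pairs that also appear in the reference set."""
    if not predicted:
        return 0.0
    return len(predicted & reference) / len(predicted)

# Made-up sentences standing in for abstract text.
sentences = [
    "Cdc28 binds Cln2 during G1.",
    "Sic1 inhibits Cdc28 activity.",
    "Swi6 and Sic1 were both detected in the sample.",
]
predicted = set().union(*(extract_pairs(s) for s in sentences))

# Made-up reference set standing in for curated database entries.
reference = {frozenset({"Cdc28", "Cln2"})}

print(precision(predicted, reference))
```

In this toy run two pairs are extracted and one of them is in the reference set, so the printed precision is 0.5. A real evaluation would compare against curated entries such as those in BIND, and a real extractor would need far more than verb co-occurrence.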

