
Using Link Grammar Parsing and Phrase Patterns to Extract Protein-Protein Interactions

Advisors: 翁昭旼, 蔣以仁

Abstract


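Information on protein-protein interactions plays a very important role in the study of molecular functional pathways, because proteins regulate many cellular functions, including intracellular signal transduction and the cell cycle. Researchers today obtain important information by reading biomedical electronic journals, but the volume of biomedical literature is growing at a staggering rate, and extracting information manually would consume a great deal of time and labor. In this thesis, we develop a tool that automatically extracts protein-protein interactions from the abstracts of biomedical literature and presents them through a web interface. We analyze the output of the Link Grammar parser to extract protein-protein interactions whose keyword is a verb, and we use phrase patterns to extract protein-protein interactions whose keyword is a noun or an adjective, as well as constructions that the Link Grammar parser cannot parse correctly. We generalize the links of Link Grammar into rules and use these rules to identify the subject, object, verb, and modifying phrases of a sentence; protein-protein interactions are then extracted from these syntactic roles. Our system was evaluated on the LLL05 challenge and achieved good performance.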
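The rule-based step described above can be illustrated with a minimal sketch. It is not the thesis's actual rule set: the linkage below is hand-written for a hypothetical sentence ("ProtA activates ProtB") instead of being produced by the Link Grammar parser, and the function name `extract_svo` is an assumption for illustration. It relies only on two link families from Link Grammar, `Ss`/`Sp` (subject noun to finite verb) and `Os`/`Op` (transitive verb to object).

```python
from typing import Dict, List, Tuple

# One link of a linkage: (label, left word, right word).
Link = Tuple[str, str, str]

SUBJECT_LABELS = {"Ss", "Sp"}  # subject noun (left) to finite verb (right)
OBJECT_LABELS = {"Os", "Op"}   # transitive verb (left) to object noun (right)


def extract_svo(links: List[Link]) -> List[Tuple[str, str, str]]:
    """Return (subject, verb, object) triples found in a list of links.

    Simplified illustration: a real rule set would also follow
    modifier links and handle passives, coordination, and so on.
    """
    subject_of: Dict[str, str] = {}
    object_of: Dict[str, str] = {}
    for label, left, right in links:
        if label in SUBJECT_LABELS:
            subject_of[right] = left
        elif label in OBJECT_LABELS:
            object_of[left] = right
    # Keep only verbs that have both a subject and an object.
    return [(subject_of[v], v, object_of[v]) for v in subject_of if v in object_of]


# Hand-written (hypothetical) linkage for "ProtA activates ProtB".
example_links: List[Link] = [
    ("Ss", "ProtA", "activates"),
    ("Os", "activates", "ProtB"),
]
triples = extract_svo(example_links)
```

In a full system, a triple would presumably count as an interaction only when the subject and object are recognized protein names and the verb is a known interaction keyword.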

Parallel Abstract


Information about protein-protein interactions is important for discovering molecular pathways. Through it, researchers in molecular biology can learn more about cellular processes, protein functions, and protein mechanisms. Many researchers obtain this knowledge from abstracts of the biomedical literature, but the amount of biomedical literature is enormous and continues to grow at an exponential rate, so keeping up with current papers and finding useful information is an overwhelming task. We develop a system that automatically extracts protein-protein interactions from biomedical abstracts. We extract protein-protein interactions whose interaction word is a verb by analyzing the output of the Link Grammar parser. Our system uses a set of rules derived from Link Grammar linkages to identify the subject, object, verb, and modifying phrases of a sentence, and extracts protein-protein interactions from these syntactic roles. Our system also takes advantage of a manual pattern approach: phrase patterns extract protein-protein interactions whose interaction keyword is a noun or an adjective, or that occur in text the Link Grammar parser cannot parse correctly, such as titles. The system is tested on the LLL05 challenge and achieves good performance.
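The phrase-pattern approach for noun and adjective keywords can be sketched as follows. The patterns, keywords, and the function name `match_patterns` are assumptions chosen for illustration; the thesis's actual pattern set is not given in this abstract. The sketch also treats any word as a candidate protein name, where a real system would restrict matches to recognized protein names.

```python
import re
from typing import List, Tuple

# Hypothetical phrase patterns. Each one captures two participants
# (groups "a" and "b") and the interaction keyword (group "kw").
PATTERNS = [
    # Noun keywords
    re.compile(r"(?P<kw>interaction) of (?P<a>\w+) with (?P<b>\w+)", re.IGNORECASE),
    re.compile(r"(?P<kw>interaction) between (?P<a>\w+) and (?P<b>\w+)", re.IGNORECASE),
    re.compile(r"(?P<kw>binding) of (?P<a>\w+) to (?P<b>\w+)", re.IGNORECASE),
    # Adjective keyword
    re.compile(r"(?P<a>\w+) is (?P<kw>dependent) on (?P<b>\w+)", re.IGNORECASE),
]


def match_patterns(sentence: str) -> List[Tuple[str, str, str]]:
    """Return (participant A, keyword, participant B) for every pattern hit."""
    hits = []
    for pattern in PATTERNS:
        for m in pattern.finditer(sentence):
            hits.append((m.group("a"), m.group("kw").lower(), m.group("b")))
    return hits


noun_hits = match_patterns("Binding of ProtA to ProtB was observed")
adjective_hits = match_patterns("ProtC is dependent on ProtD")
```

Because these patterns work on the surface string, they can still apply to titles and other fragments for which a full parse is unreliable.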

