Template-driven Approaches for Protein-protein Interaction Families

Advisor: Jinn-Moon Yang (楊進木)

Abstract


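Classifying proteins into families helps researchers understand protein function and the evolutionary relationships among proteins in greater depth. Likewise, with the rapid growth of protein-protein interaction (PPI) data, most of it from high-throughput screening experiments, researchers urgently need fast and accurate methods for classifying PPIs into families composed of homologous PPIs in order to understand newly identified interactions. To address this need, we propose a new concept, the PPI family, and build two template-driven methods on it: PPISearch and SB-HomPPI. PPISearch (http://gemdock.life.nctu.edu.tw/ppisearch) is a tool that rapidly searches for PPI families and can reasonably annotate uncharacterized PPIs; the annotations cover functional domains and biochemical functions. In this study, a PPI is treated as a homologous PPI of a query protein pair when it shows significant sequence similarity to that pair (BLASTP E-values ≤ 10^-40) and is recorded in a large PPI database (290,137 PPIs from 576 species). Our results show that up to 88% of functional domain annotations and 69% of biochemical function annotations can be reasonably transferred from homologous PPIs to the query protein pair. Two problems remain, however. First, the number of PPIs per species in our PPI database is uneven: a few species account for most of the records, yeast above all. Second, searching for homologous proteins with a local sequence alignment tool (such as BLASTP) is biased toward long functional domains, which do not necessarily take part in the interaction. To address these problems, we combine the PPI family concept with the "3D-domain interologs" method previously proposed by our laboratory to build a new method, SB-HomPPI. SB-HomPPI uses the interaction interfaces of heterodimer structures as templates and searches across multiple species with complete genomes (e.g., the Integr8 database) to identify a structure-based PPI family (SB-PPI family), which consists of homologous PPIs with structurally similar interaction interfaces (SB-HomPPIs). Using the Integr8 database avoids the limitations imposed by PPI databases, and focusing on the interface corrects the misjudgments caused by BLASTP local alignment. Our results show that interacting domain pairs are highly conserved in both the SB-PPI family (94%) and the PPI family (86%). A similar result appears in the conservation analysis of Gene Ontology (GO) term pairs, where the SB-PPI family exceeds the PPI family by more than 30%. In summary, interacting domain pairs and GO term pairs are highly conserved biological properties within PPI families.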
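The homologous-PPI criterion stated above (significant BLASTP similarity to both query proteins, plus presence in the PPI database) can be illustrated with a minimal sketch. The abstract does not specify how the "joint" similarity is computed, so this sketch assumes that each partner must individually meet the E-value cutoff. The function name, the input dictionaries, and the protein IDs are hypothetical and are not taken from PPISearch.

```python
from itertools import product

E_VALUE_CUTOFF = 1e-40  # threshold quoted in the abstract


def find_homologous_ppis(hits_a, hits_b, known_ppis, cutoff=E_VALUE_CUTOFF):
    """Return recorded PPIs whose partners are significant hits of the query pair.

    hits_a, hits_b: dicts mapping a protein ID to its BLASTP E-value against
                    query protein A / query protein B (hypothetical inputs).
    known_ppis:     set of frozenset({id1, id2}) for interactions recorded in
                    the PPI database.
    Assumption: both partners must satisfy the cutoff individually.
    """
    homologs_a = {p: e for p, e in hits_a.items() if e <= cutoff}
    homologs_b = {p: e for p, e in hits_b.items() if e <= cutoff}
    family = []
    for (pa, ea), (pb, eb) in product(homologs_a.items(), homologs_b.items()):
        # Keep the candidate pair only if the interaction itself is recorded.
        if frozenset((pa, pb)) in known_ppis:
            family.append((pa, pb, ea, eb))
    return sorted(family)


# Toy data with made-up identifiers.
hits_a = {"A1": 1e-80, "A2": 1e-45, "A3": 1e-10}
hits_b = {"B1": 1e-60, "B2": 1e-42}
known = {frozenset(("A1", "B1")), frozenset(("A3", "B2")), frozenset(("A2", "B9"))}
family = find_homologous_ppis(hits_a, hits_b, known)
```

In this toy case only the A1-B1 pair survives: A3 fails the cutoff, and A2's recorded partner B9 is not a hit of query B.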

Parallel Abstract


Classifying proteins into families describes their functional and evolutionary relationships. Likewise, as more protein-protein interactions (PPIs) become available and high-throughput experiments identify PPIs systematically, there is a growing need for fast and accurate approaches that classify PPIs into families (i.e., groups of homologous PPIs) in order to understand a newly determined PPI. To address this need, we propose the concept of a "PPI family" and use it to construct two new template-driven approaches, "PPISearch" and "SB-HomPPI". PPISearch (http://gemdock.life.nctu.edu.tw/ppisearch) is a tool that rapidly identifies the PPI family of a query protein pair and infers which interacting domains and functions can be transferred to it. We identified protein pairs as homologous PPIs when they had significant joint sequence similarity to the query sequences (BLASTP E-values ≤ 10^-40) and were recorded in the annotated database (290,137 PPIs in 576 species). Our results showed that 88% of conserved domain-domain pairs and 69% of conserved function term pairs are transferable between query pairs and their homologous PPIs. However, we found that the annotated database is dominated by a few species, especially yeast, and that searching for homologs by local alignment (e.g., BLASTP) is biased toward large domains that may not be involved in the binding interface. To address these problems, we combined the "PPI family" concept with our previous work on "3D-domain interologs" to construct "SB-HomPPI". SB-HomPPI uses the interfaces of heterodimer structures as templates to identify a structure-based PPI family (SB-PPI family), composed of structure-based homologous PPIs (SB-HomPPIs), across multiple complete genomes (e.g., the Integr8 database). By using the Integr8 database and focusing on the interface, this approach avoids both the limitation of the annotated database and the bias of BLASTP local alignment. Our results showed that interacting domain pairs are highly conserved in both the SB-PPI family (94%) and the PPI family (86%). Similarly, the SB-PPI family outperformed the PPI family by at least 30% in the conservation of Gene Ontology (GO) term pairs. In conclusion, interacting domain pairs and GO term pairs are highly conserved biological properties within PPI families.
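The abstract reports conservation percentages for interacting domain pairs and GO term pairs but does not define the measure. As a rough illustration only, the sketch below assumes conservation is the fraction of family members that share at least one annotation pair with the query. The function name, that definition, and the toy annotation labels are all assumptions for illustration.

```python
def pair_conservation(query_pairs, member_pairs_list):
    """Fraction of family members sharing at least one annotation pair with the query.

    query_pairs:       iterable of (annotation_1, annotation_2) tuples for the query PPI.
    member_pairs_list: list with one iterable of such tuples per family member.
    Pairs are treated as unordered. This definition is an assumption; the
    abstract does not state how conservation was measured.
    """
    query = {frozenset(p) for p in query_pairs}
    if not member_pairs_list:
        return 0.0
    shared = sum(
        1
        for pairs in member_pairs_list
        if query & {frozenset(p) for p in pairs}
    )
    return shared / len(member_pairs_list)


# Toy annotations with made-up domain labels.
query = [("domX", "domY")]
members = [
    [("domY", "domX")],                    # same pair, reversed order
    [("domX", "domZ")],                    # different pair
    [("domX", "domY"), ("domW", "domZ")],  # one matching pair
    [],                                    # member without annotations
]
ratio = pair_conservation(query, members)
```

Here two of the four members share the query's domain pair, so `ratio` is 0.5. The same function applies to GO term pairs if the tuples hold GO terms instead of domain labels.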
