結合功能與拓樸性質標定蛋白質交互作用網路之新穎蛋白質節點

Combining Functional and Topological Properties to Identify Novel Hubs in Protein-Protein Interaction Networks

Advisor: 高成炎
Co-advisor: 黃奇英

Abstract


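This dissertation studies biologically active protein-protein interactions (PPIs), including the design of a PPI database, the analysis of PPI networks, and the prediction of unknown protein targets. It comprises three parts: (1) constructing the global PPI network of a species; (2) constructing a temporally and spatially relevant PPI sub-network from a set of proteins of interest; and (3) using that sub-network to infer potential regulatory processes among the proteins of interest and to identify the important protein nodes involved. PPIs can be predicted in many ways: from gene neighborhood (gene neighbor) and related gene groups (gene fusion); from evolutionary variation across a set of genes (phylogenetic profile); from protein active domains across species, used to infer possible interactions in other species (Rosetta Stone method); from protein sequence similarity over the course of evolution (sequence-based co-evolution); and from approaches based on synthetic lethality, temporal factors, co-expressed genes, and the cell cycle.
For a PPI network that mixes temporal and spatial contexts: (1) We developed a PPI database, named POINeT, whose main function is PPI search. We collected data from many PPI databases and used protein homology to extend the PPI information available across species. The confidence of a PPI can be assessed in several ways, for example by the number of literature reports, the experimental methods and techniques used, direct interactions among a group of related proteins, gene knowledge and descriptive annotation, and homology relationships between species. (2) A PPI network contains many sub-networks tied to temporal and spatial factors. For example, POINeT can be used to search the PPI network for potential regulatory relationships among genes highly expressed on a microarray and for important protein nodes. POINeT analyzes a PPI network using biological annotation together with network topological properties. On the topological side, it uses two kinds of measures, network centrality and node specificity, to assess the importance of each node in the network. (3) Combining biological and topological properties makes it possible to assign each node in the network a level of importance. The novel protein nodes identified this way can then be validated in follow-up biological experiments. Analyzing the PPI network further with cliques reveals functional units within the network. These units indicate human PPI complexes and the degree to which complexes are conserved across species, and combining cliques with known protein complexes lets us infer a pathway composed of those complexes. Finally, we extended the concept of interologs (PPIs inferred across species) to predict the PPI network between host and pathogen.
Using the approach proposed in this dissertation, we applied the algorithms provided by POINeT to the midbody proteome interaction sub-network. We identified 5 candidate proteins and used 183 known midbody proteins to show that the 5 candidates do belong to the midbody proteome interaction network. In addition, the sub-network inferred from highly dense cliques not only shows the evolutionary conservation of homologous complexes from yeast to human, but was also supported experimentally by SEPT6 co-localization. This relationship was used to infer a potential, previously unknown pathway, an approach that can extend the prediction of unknown pathways. Finally, POINeT can be further developed and applied to a variety of protein sub-networks, and its in-house algorithms improved, with the aim of broadening prediction and improving the accuracy of PPI sub-networks.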
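The homology-based extension of PPI data mentioned above can be illustrated with a minimal sketch. It assumes the simplest interolog rule: if two proteins interact in a source species and each has an ortholog in the target species, the ortholog pair is a predicted interaction. The function name `infer_interologs`, the data layout, and the toy identifiers are hypothetical and are not taken from POINeT.

```python
from itertools import product


def infer_interologs(source_ppis, orthologs):
    """Transfer interactions from a source species to a target species.

    source_ppis: iterable of (protein_a, protein_b) pairs in the source species.
    orthologs:   dict mapping a source protein to a set of target-species orthologs.
    Returns a set of frozenset pairs predicted in the target species.
    """
    predicted = set()
    for a, b in source_ppis:
        # Every ortholog pair of an interacting source pair is a candidate interolog.
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ta != tb:  # self-pairs are skipped in this sketch
                predicted.add(frozenset((ta, tb)))
    return predicted


# Toy data (hypothetical identifiers, for illustration only).
yeast_ppis = [("yA", "yB"), ("yB", "yC")]
yeast_to_human = {"yA": {"hA"}, "yB": {"hB1", "hB2"}}  # yC has no known ortholog
human_interologs = infer_interologs(yeast_ppis, yeast_to_human)
```

A source interaction contributes nothing when either partner lacks an ortholog, which is why the second yeast pair yields no prediction here. A real pipeline would also attach a confidence score to each inferred pair.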

Parallel Abstract


This doctoral dissertation addresses protein-protein interactions (PPIs) across several biological questions, organized around three computational goals: database construction, mining, and prediction. Specifically, the work comprises (i) constructing a spatially and temporally composite global network, (ii) mining spatially and temporally decomposed sub-networks, and (iii) inferring putative biological cascades and unobserved interactomes. Toward these goals, various computational PPI prediction methods may be integrated, for instance through Gene Ontology, and exploited together with synthetic lethality. These methods draw on location features among gene neighbors and gene clusters; evolutionary features from the phylogenetic profile, the Rosetta Stone method, and sequence-based co-evolution; and temporal features from co-expression and cell-cycle-specific expression. (i) In the spatially and temporally composite setting, the POINeT database, which includes a PPI network display, retrieves data from multiple PPI sources and extends them with putative interologs. PPI confidence is evaluated by the number of supporting publications, experimental techniques, interactions among the queried proteins, Gene Ontology, and interologs. (ii) Novel hubs among the network nodes can be mined from a spatially and temporally relevant PPI sub-network, for example the POINeT sub-network retrieved with up-regulated microarray genes. Mining for important hubs within a sub-network relies on biological features and network topological features. Hubs are prioritized by comparing the degree of a node in the given sub-network with the degree statistics of the same node in randomly sampled sub-networks of the same size. The implemented mining algorithms include centrality indices for all protein nodes in a PPI sub-network and a sub-network specificity score for spatial and temporal relevance.
(iii) Starting from the verified sub-network with its specified relevance and hubs, PPI prediction is then explored through clique analysis of the sub-network topology, with two aims: inferring putative human PPI complexes from known yeast sets within a PPI sub-network, and predicting inter-species interactions between the PPI sub-networks of host and pathogen through interolog analysis with ortholog information. Moreover, (i) POINeT and the implemented algorithms were applied to PPI sub-networks: mining the mitotic midbody sub-network, predicting the mitotic spindle sub-network, and predicting inter-species interactions between host and pathogen. (ii) The mining output prioritized 5 previously unobserved candidate proteins that are satisfactorily consistent with 183 biologically verified midbody proteins, although one putative protein, fused to an antigen tag for detection with a monoclonal antibody, has never been shown to co-localize at the midbody. In addition, (iii) prediction based on highly iterative clique analysis of the sub-networks not only revealed the conserved spindle network from yeast to human in pathway form, as evidenced by a SEPT6 co-localization assay, but also unveiled a previously unobserved inter-species interactome of host and pathogen. Future work will likely involve further development and application of POINeT and the in-house algorithms toward broader mining and more accurate prediction on PPI sub-networks.
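The degree-based hub prioritization described in (ii) can be sketched as follows. This is a minimal illustration under stated assumptions, not the POINeT implementation. It compares a node's degree inside the sub-network with its degree inside randomly sampled node sets of the same size drawn from the global network, and it summarizes the comparison as a z-score. The z-score, the function names, the default sample count, and the toy network are all assumptions introduced here.

```python
import random
import statistics


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_specificity(adj, subnet_nodes, node, n_samples=1000, seed=0):
    """Compare a node's sub-network degree with its degree in random sub-networks.

    The node itself is kept fixed; the remaining members of each random
    sub-network are sampled without replacement from the global network.
    Returns (observed_degree, mean_random_degree, z_score).
    """
    subnet = set(subnet_nodes)
    observed = len(adj[node] & subnet)
    others = [n for n in adj if n != node]
    k = len(subnet - {node})  # number of partner nodes to sample
    rng = random.Random(seed)
    sampled = [
        len(adj[node] & set(rng.sample(others, k))) for _ in range(n_samples)
    ]
    mean = statistics.mean(sampled)
    sd = statistics.pstdev(sampled)
    z = (observed - mean) / sd if sd > 0 else 0.0  # assumed scoring rule
    return observed, mean, z


# Toy global network (hypothetical): H is linked only to A, B, and C.
edges = [
    ("H", "A"), ("H", "B"), ("H", "C"), ("A", "B"),
    ("D", "E"), ("E", "F"), ("F", "G"), ("G", "I"), ("I", "J"), ("J", "D"),
]
adj = build_adjacency(edges)
observed, mean, z = degree_specificity(adj, {"H", "A", "B", "C"}, "H")
```

A node whose degree in the sub-network is well above its degree in random sub-networks of the same size is a candidate sub-network-specific hub. A node that is highly connected everywhere scores lower, which is the distinction a specificity score is meant to capture.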

