蛋白質交互作用網路之拓樸相似性與基因表現圖譜之相關性研究

Topological Similarity of Protein Interaction Network and its Correlation with Gene Expression Profiles

Advisor: 邱式鴻
Co-advisors: 阮雪芬, 黃宣誠

Abstract


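Searching for functional genes in the genomes of different species and characterizing them (protein annotation) is a challenging task in the post-genomic era. Developing effective bioinformatics methods for identifying protein function is therefore an important part of life science research. Many existing methods identify the functions of unknown proteins from available gene sequences, gene expression profiles, or protein interaction data. Here we put forward a new idea: proteins that are similar in the topological map of a protein interaction network may also be similar in function. We propose methods that evaluate the topological similarity of any pair of proteins in the interaction network, in order to find the correlation coefficient between the protein interaction network and overall gene expression.

To measure the similarity of two proteins in the interaction network, we provide two new methods. The first is first neighbor comparison, which compares the proteins' directly interacting neighbors; the second is the topological similarity method. First neighbor comparison compares the interaction profiles of the two proteins against every third protein and computes a Pearson correlation coefficient. The topological similarity score is computed from the similarity between two graph vertices, as defined in graph theory. Both methods assign a number to every pair of proteins in the interaction network, whether or not the pair interacts, and can provide statistically evaluated weights. To find the relationship between the protein interaction network and gene expression profiles, we collected a large amount of publicly available microarray data and computed the correlation coefficient of the expression profiles for each pair of genes using the Pearson, Spearman, and Kendall methods. This coefficient can be regarded as a measure of the similarity between the expression profiles of a gene pair. Representing both the protein interactions and the expression-profile similarities as matrices, we combine matrix algebra and statistical methods to obtain the correlation coefficient between the protein interaction network and gene expression profiles. The approach has been applied to Escherichia coli, yeast, and Helicobacter pylori.

The results show a positive correlation between protein interaction networks and gene expression profiles, which indicates that topological similarity can reflect the association between proteins. This new approach may make important contributions to the functional annotation of genomes and the classification of protein families.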
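The first neighbor comparison described above can be illustrated with a minimal sketch. It assumes the network is stored as a symmetric 0/1 adjacency matrix and that "any third protein" means the two proteins being compared are left out of their own profiles; the function names and the toy network are hypothetical and are not taken from the thesis.

```python
import numpy as np

def first_neighbor_score(adjacency: np.ndarray, i: int, j: int) -> float:
    """Pearson correlation of the interaction profiles of proteins i and j,
    measured against every third protein (assumption: i and j are excluded)."""
    mask = np.ones(adjacency.shape[0], dtype=bool)
    mask[[i, j]] = False                      # keep only the "third" proteins
    a = adjacency[i, mask].astype(float)
    b = adjacency[j, mask].astype(float)
    # Pearson correlation is undefined when a profile is constant.
    if a.max() == a.min() or b.max() == b.min():
        return float("nan")
    return float(np.corrcoef(a, b)[0, 1])

def first_neighbor_matrix(adjacency: np.ndarray) -> np.ndarray:
    """Score every protein pair, interacting or not."""
    n = adjacency.shape[0]
    scores = np.full((n, n), np.nan)
    np.fill_diagonal(scores, 1.0)
    for i in range(n):
        for j in range(i + 1, n):
            scores[i, j] = scores[j, i] = first_neighbor_score(adjacency, i, j)
    return scores

# Hypothetical toy network: proteins 0 and 1 do not interact with each other
# but share exactly the same partners (proteins 2 and 3).
adjacency = np.array([
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
])
scores = first_neighbor_matrix(adjacency)
print(scores[0, 1])
```

In this toy network proteins 0 and 1 receive the highest possible score even though they do not interact directly, which is the kind of pair the method is meant to surface. Pairs whose remaining profile is constant are reported as NaN rather than given an arbitrary value.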

Parallel Abstract


Searching for functional genes across whole genomes and annotating them is one of the most challenging problems of the post-genomic era. Developing meaningful bioinformatics methods for assigning protein functions is therefore important to the life sciences. Many approaches assign putative functions to unannotated proteins using information from sequences, gene expression profiles, or protein-protein interaction data. Here we propose that proteins with similar topological maps in a protein interaction network may share similar biological functions, and we demonstrate methods for measuring the topological similarity between two proteins in the entire network and its correlation with gene expression profiles. To measure the topological similarity between two proteins in an interaction network, we propose two scoring methods: first neighbor comparison and the topological similarity score. First neighbor comparison compares the interaction profiles of two proteins against every third protein and calculates their Pearson correlation coefficient. The topological similarity score is based on a measure of similarity between graph vertices from graph theory. Both methods assign a numerical value to every pair of proteins in the network, whether or not the pair interacts, and attach statistical evaluation weights to these values. To find the correlation between the protein interaction network and gene expression profiles, we collected a large amount of publicly accessible gene expression microarray data and calculated the correlation coefficients of the expression profiles for all pairs of genes using the Pearson, Spearman, and Kendall methods. These correlation coefficients are regarded as a measure of the expression similarity between a pair of genes.
With one matrix representing the topological similarity of protein interactions and another representing the expression similarity between each pair of genes and their corresponding proteins, we combine matrix algebra and statistical methods to obtain the correlation coefficient between gene expression profiles and protein-protein interactions within the proteomes of Escherichia coli, yeast, and Helicobacter pylori. Our results reveal a strong positive relationship between the topological similarity scores of protein interaction networks and the correlation coefficients of gene expression profiles. This indicates that topological similarity can reflect the functional association between proteins. The approach has potential applications in the functional annotation of genomes and the clustering of protein families.
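The comparison between the two matrices can be sketched as follows. This is an illustration under stated assumptions, not the thesis's actual procedure: expression data is assumed to be a genes-by-arrays matrix, the two similarity matrices are assumed to share the same gene/protein ordering, and the matrix-level comparison is assumed to be a Pearson correlation over the unique off-diagonal pairs. The helper names and the toy values are hypothetical; the SciPy functions `pearsonr`, `spearmanr`, and `kendalltau` are real.

```python
import numpy as np
from scipy.stats import kendalltau, pearsonr, spearmanr

METHODS = {"pearson": pearsonr, "spearman": spearmanr, "kendall": kendalltau}

def expression_similarity(expression: np.ndarray, method: str = "pearson") -> np.ndarray:
    """Pairwise correlation of expression profiles (rows = genes, columns = arrays)."""
    corr = METHODS[method]
    n = expression.shape[0]
    sim = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            # Index [0] is the correlation statistic; [1] would be the p-value.
            sim[i, j] = sim[j, i] = corr(expression[i], expression[j])[0]
    return sim

def matrix_correlation(topology_sim: np.ndarray, expression_sim: np.ndarray) -> float:
    """Pearson correlation across unique off-diagonal pairs (an assumed comparison)."""
    rows, cols = np.triu_indices_from(topology_sim, k=1)
    return float(pearsonr(topology_sim[rows, cols], expression_sim[rows, cols])[0])

# Hypothetical expression values for four genes across five arrays.
expression = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 5.9, 8.2, 9.8],
    [5.0, 3.9, 3.1, 2.0, 1.2],
    [1.0, 3.0, 2.0, 5.0, 4.0],
])
# Hypothetical topological similarity scores for the corresponding proteins.
topology_sim = np.array([
    [1.0, 0.9, -0.5, 0.4],
    [0.9, 1.0, -0.6, 0.3],
    [-0.5, -0.6, 1.0, -0.2],
    [0.4, 0.3, -0.2, 1.0],
])

for name in METHODS:
    sim = expression_similarity(expression, name)
    print(name, matrix_correlation(topology_sim, sim))
```

Restricting the comparison to the upper triangle avoids counting each pair twice and keeps the trivially perfect diagonal out of the estimate.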

