
Integrative association analysis of biological pathways using Bayesian statistical models

Pathway-based Bayesian integrative analysis for genetic association studies

Advisor: 蕭朱杏

Abstract


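With the rapid development of biotechnology, a growing volume of multi-platform genetic data allows researchers to perform integrative analysis across platforms. The difficulty, however, lies in handling the relationships between marker data from different platforms and the correlation among markers within the same platform. In addition, gene-set-based genetic analysis has been shown to have greater power than single-marker tests in association studies, so how to incorporate gene sets into integrative analysis is currently a key issue. This thesis proposes a Pathway-based Bayesian integrative analysis (PaBIA) model that integrates data from two platforms, gene expression and DNA methylation, while incorporating the concept of pathway topology into the model. Through inference on the posterior distribution, influential genes in a given pathway can be detected and ranked by importance. In simulation studies, compared with traditional methods, the PaBIA model had a lower false discovery proportion and a higher true negative rate, but its (true positive rate + true negative rate)/2 was slightly worse than that of traditional methods, by less than 2%. Finally, we demonstrate the model by analyzing multiple KEGG pathways using next-generation sequencing data from high-grade ductal carcinoma in situ and microarray gene data from ovarian cancer. In the real data analysis, the genes ranked highest by PaBIA have all been reported in the literature as associated with breast or ovarian cancer, and some are already used as target genes in treating breast cancer or other cancers.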

Parallel Abstract


Rapid advances in biotechnology have made genetic data from multiple platforms accessible, allowing scientists to perform integrative analysis. Challenges arise, however, in dealing with the relationship between data from different sources, as well as the correlation between markers from the same platform. In association studies, set-based genetic analysis has been shown to achieve greater statistical power than single-marker tests. The incorporation of gene sets into integrative analysis has therefore become a critical issue. In this thesis we propose a Pathway-based Bayesian integrative analysis (PaBIA) model to integrate RNA expression and DNA methylation data, simultaneously incorporating the concept of pathway topology to model the relationship between marker values. Based on posterior inference, influential genes in given pathways can be identified and ranked. In simulation studies, the proposed model achieved a lower false discovery proportion and a higher true negative rate than traditional approaches. The (true positive rate + true negative rate)/2 of PaBIA is smaller than that of other methods by less than 2%. Finally, we illustrate this approach with a high-grade ductal carcinoma in situ study and an ovarian cancer study, using KEGG pathways. The top-ranking genes have been reported in previous literature to be associated with breast cancer or ovarian cancer, and some have even been used in targeted therapy.
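
The three simulation criteria named above can be computed from a list of truly influential genes and a list of genes a method selects. The following is a minimal sketch, not the thesis code: the function name `selection_metrics`, the boolean-list input format, and the convention of reporting 0.0 when a denominator is empty are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def selection_metrics(truth: Sequence[bool], selected: Sequence[bool]) -> Dict[str, float]:
    """Compare selected genes with the known truth in one simulated data set.

    truth[i]    -- True if gene i is truly influential (hypothetical input format)
    selected[i] -- True if the method flags gene i as influential
    """
    if len(truth) != len(selected):
        raise ValueError("truth and selected must have the same length")

    tp = sum(1 for t, s in zip(truth, selected) if t and s)
    fp = sum(1 for t, s in zip(truth, selected) if not t and s)
    tn = sum(1 for t, s in zip(truth, selected) if not t and not s)
    fn = sum(1 for t, s in zip(truth, selected) if t and not s)

    # Assumed convention: a ratio with an empty denominator is reported as 0.0.
    fdp = fp / (tp + fp) if (tp + fp) else 0.0  # false discovery proportion
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate

    return {"FDP": fdp, "TPR": tpr, "TNR": tnr, "balanced": (tpr + tnr) / 2}


# Toy example: 6 genes, 2 truly influential, 2 selected (1 correct, 1 false).
truth = [True, True, False, False, False, False]
selected = [True, False, True, False, False, False]
metrics = selection_metrics(truth, selected)
print(metrics)
```

The toy example shows why the criteria can disagree: a conservative method that selects few genes keeps the false discovery proportion and false positives low, which raises the true negative rate, while missed genes lower the true positive rate and hence the averaged (true positive rate + true negative rate)/2.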

