
Predicting Cooperativity among Yeast Transcription Factors Based on Sequence Features

Incorporating Motif Discovery in Investigation of Transcription Factor Cooperativity in Saccharomyces cerevisiae

Advisor: 歐陽彥正

Abstract


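Transcriptional regulation takes place after transcription factors (TFs) bind to the upstream promoters of their target genes, so TFs play a central role in it. Empirically, a TF rarely regulates the expression of its target genes alone; it usually has to cooperate with other TFs, which makes identifying and predicting groups of cooperating TFs an important research topic. Recent advances in high-throughput technologies such as chromatin immunoprecipitation chips (ChIP-chip) and microarrays, which biologists have applied extensively to the yeast genome, provide rich material for studying transcription regulatory modules. Many bioinformatics studies, using ChIP-chip data alone or combining ChIP-chip with microarray data, have developed computational methods for investigating TF cooperativity. However, methods that rely on gene expression data are severely limited by the availability and quality of microarray data. This thesis therefore asks whether, for methods that use ChIP-chip data alone, incorporating the discovery and analysis of TF binding-site motifs can further improve the prediction accuracy of existing methods of this kind. The proposed method works as follows. First, the TF-to-target-gene binding data from ChIP-chip are used to identify the target genes of each TF. Next, a sequence motif discovery algorithm previously developed in our laboratory finds the top ten candidate binding-site motifs for each TF. Then, for each TF pair that shares at least one target gene, the similarity between the two TFs' top-ten motifs is computed for all 10 x 10 = 100 combinations, and only TF pairs with at least one similar motif pair are retained. Finally, the predicted TF pairs are ranked by a score that measures the degree to which the two TFs share target genes. To evaluate the method, we built a reference set of protein-protein interactions and synergistic TF pairs collected from various protein databases and the literature. Evaluated against this reference set, the proposed approach outperforms other sequence-based methods. In addition, because it retains both the TF pairings and the associated binding-site motifs, it links transcription regulatory modules to binding-site motifs, which facilitates the construction of gene regulatory networks.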
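The pair-filtering step described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how motif similarity is measured, so `motif_similarity` (best fraction of matching positions over ungapped alignments of consensus strings), the `threshold` of 0.8, and the function and variable names are all assumptions made for illustration.

```python
from itertools import combinations, product


def motif_similarity(a: str, b: str) -> float:
    """Hypothetical similarity between two consensus motifs.

    Slides the shorter motif along the longer one without gaps and returns
    the best fraction of matching positions. The thesis's actual measure is
    not given in the abstract.
    """
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    if not short:
        return 0.0
    best = 0
    for offset in range(len(long_) - len(short) + 1):
        matches = sum(s == l for s, l in zip(short, long_[offset:]))
        best = max(best, matches)
    return best / len(short)


def candidate_pairs(targets, motifs, threshold=0.8):
    """Return TF pairs that pass both filters from the abstract.

    targets: dict mapping TF name -> set of target gene names
    motifs:  dict mapping TF name -> ranked list of motif strings
    threshold: assumed cutoff for calling two motifs "similar"
    """
    pairs = []
    for tf_a, tf_b in combinations(sorted(targets), 2):
        # Filter 1: the two TFs must share at least one target gene.
        if not targets[tf_a] & targets[tf_b]:
            continue
        # Filter 2: compare up to 10 x 10 = 100 motif combinations and
        # keep the pair if at least one combination is similar.
        top_a = motifs.get(tf_a, [])[:10]
        top_b = motifs.get(tf_b, [])[:10]
        if any(motif_similarity(m, n) >= threshold
               for m, n in product(top_a, top_b)):
            pairs.append((tf_a, tf_b))
    return pairs


# Toy data with made-up TF and gene names.
toy_targets = {"TF_A": {"g1", "g2"}, "TF_B": {"g2", "g3"}, "TF_C": {"g4"}}
toy_motifs = {"TF_A": ["ACGTGA"], "TF_B": ["ACGTGC"], "TF_C": ["ACGTGA"]}
print(candidate_pairs(toy_targets, toy_motifs))
```

In this toy input, `TF_C` has a motif identical to one of `TF_A`'s but shares no target gene with any other TF, so it is dropped by the first filter before any motif comparison is made.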

Parallel Abstract


Transcriptional regulation typically happens after transcription factors (TFs) bind to specific promoter regions of their target genes. TFs frequently regulate gene expression by cooperating with other TFs. Recent advances in high-throughput tools, e.g., chromatin immunoprecipitation chip (ChIP-chip) and microarray expression profiling, provide considerable information for investigating transcription regulatory modules (TRMs), or groups of cooperative TFs. Many recent studies have developed computational methods to study TF cooperativity by using ChIP-chip data alone or by integrating information from both ChIP-chip and microarray data. Because methods that employ gene expression information rely heavily on the availability and quality of microarray data, this thesis proposes a method named simTFBS, which uses ChIP-chip data alone but incorporates pattern discovery and analysis procedures when finding potential cooperative TF pairs. The proposed method first identifies potential target genes for each TF based on the ChIP-chip data. A previously developed algorithm for predicting TF binding sites (TFBSs) is then applied to each TF to derive a top-10 list of potential TFBSs. For each pair of TFs with at least one common target gene, we check whether their top-10 pattern lists share at least one pair of similar TFBSs, which would suggest cooperativity. Finally, each TF pair is given a score representing its degree of cooperativity, defined as the mutual information between the two TFs' target gene lists. The answer set for evaluation is built by collecting known protein-protein interactions (PPIs) from databases and annotated synergy relationships from the literature. The results reveal that the proposed approach performs better than many existing methods and also helps to associate a potential TRM with the related TFBSs when constructing gene regulatory networks.
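One plausible reading of the mutual information score is sketched below in Python. The abstract does not give the exact formulation, so this sketch assumes each TF's target gene list is treated as a binary membership variable over a gene universe of size `n_genes`, and that the score is the standard mutual information (in bits) of the resulting 2 x 2 contingency table. The function name and parameters are hypothetical.

```python
import math


def mutual_information(targets_a: set, targets_b: set, n_genes: int) -> float:
    """Mutual information (bits) between two binary target-membership variables.

    Assumption: every gene in a universe of n_genes is either a target or
    not a target of each TF; the thesis's exact definition may differ.
    """
    n11 = len(targets_a & targets_b)   # bound by both TFs
    n10 = len(targets_a) - n11         # bound by A only
    n01 = len(targets_b) - n11         # bound by B only
    n00 = n_genes - n11 - n10 - n01    # bound by neither
    if n00 < 0:
        raise ValueError("n_genes is smaller than the union of the target sets")

    # Each tuple is (joint count, marginal count for A, marginal count for B).
    cells = (
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    )
    mi = 0.0
    for count, row_total, col_total in cells:
        if count > 0:  # empty cells contribute nothing
            mi += (count / n_genes) * math.log2(
                count * n_genes / (row_total * col_total)
            )
    return mi


# Toy examples with made-up gene names and a universe of 4 genes.
identical = mutual_information({"g1", "g2"}, {"g1", "g2"}, n_genes=4)
independent = mutual_information({"g1", "g2"}, {"g1", "g3"}, n_genes=4)
print(identical, independent)
```

Under this reading, two TFs whose target lists coincide get a high score, while two TFs whose lists overlap only as much as chance predicts score zero. Note that mutual information is also high for strongly anti-correlated lists, which is one reason a shared-target prefilter is useful before ranking.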

