
Observing How Conserved Amino Acids Affect Binding Energy Through Energy Analysis of Existing Protein-DNA Complex Structures

Investigating Influence of Conserved Residues on Binding Specificity Based on Existing Protein-DNA Complexes

Advisor: 陳倩瑜

Abstract


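For most species whose genomes have been sequenced, the functions of more than half of the genes are still unknown. In general, a gene carries out its function by being expressed as a protein, and the regulation of this expression is influenced by transcription factors, which are able to recognize target genes. Some databases provide experimentally derived information about these transcription factors, and such databases are widely used to search for DNA sequences that transcription factors may recognize. However, because actual experiments are costly, their usefulness for accurately predicting target sites is limited, and we clearly still need computational simulation techniques to help analyze genomes quickly and at large scale. In recent years the amount of structural data on protein-DNA complexes has grown steadily, providing a great deal of information about the specific interactions between DNA bases and amino acids. Examining these structures reveals, however, that there is no simple one-to-one correspondence between bases and amino acids. Their interactions are spatially diverse, and the same amino acid-base pair may adopt more than one geometric arrangement when interacting. Even so, careful examination of the structural data has shown that certain interaction patterns do recur across frequently occurring protein-DNA interactions. Previous studies show that amino acids can discriminate among bases when binding, and these rules have been observed to hold for certain protein families that bind specific DNA sequences; when the rules are applied too broadly, however, many exceptions arise. This thesis takes an energy-based view, using the existing energy calculation software Rosetta together with the large body of structural data in the Protein Data Bank (PDB). Structures that pass our filtering serve as input files to Rosetta, and RosettaDesign, one of Rosetta's molecular modeling modes, performs energy calculations on the three-dimensional coordinates of every atom in the input. This yields the energy of the same complex after a base in the double-helical DNA or an amino acid in the protein sequence has been substituted, and it can also compute the optimal energy and structure of the complex as a whole. By calculating the total energy of each protein-DNA complex before and after an amino acid or base substitution, and by screening for conserved amino acids, we treat two characteristics, whether the energy changes drastically and whether the residue is conserved, and use statistical methods to assess how conserved DNA-binding amino acids influence protein-DNA binding.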
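The labeling step described above, comparing the total energy of a complex before and after each substitution, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how a drastic change is defined, so the fixed `threshold`, the choice to take the largest absolute change per position, the `Substitution` record, and the position labels are all hypothetical, and the energies are made-up numbers, not Rosetta output.

```python
from dataclasses import dataclass


@dataclass
class Substitution:
    position: str          # hypothetical label, e.g. "A:ARG45" (chain:residue)
    mutant_energy: float   # total complex energy after the substitution


def energy_changes(wild_type_energy, substitutions):
    """Return {position: largest absolute energy change over its substitutions}."""
    changes = {}
    for sub in substitutions:
        delta = abs(sub.mutant_energy - wild_type_energy)
        changes[sub.position] = max(delta, changes.get(sub.position, 0.0))
    return changes


def label_remarkable(changes, threshold):
    """Flag positions whose energy change meets or exceeds the threshold.

    The threshold is an assumption of this sketch, not a value from the thesis.
    """
    return {pos: delta >= threshold for pos, delta in changes.items()}


# Made-up energies for illustration only.
subs = [
    Substitution("A:ARG45", -118.0),
    Substitution("A:ARG45", -105.5),
    Substitution("A:GLY12", -120.5),
]
changes = energy_changes(-121.0, subs)
labels = label_remarkable(changes, threshold=5.0)
```

Taking the maximum over all substitutions at a position treats a position as sensitive if any single replacement disturbs the complex; an average or a per-substitution label would be equally plausible readings of the abstract.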

Parallel Abstract


For most species whose genomes have been sequenced, the functions of about half of the genes remain unknown. Many of these genes encode transcription factors (TFs), which recognize target genes. Some databases provide experimentally derived information about transcription factor binding sites, and such data are widely used to search for DNA sequences targeted by transcription factors. However, these experiments are costly and labor-intensive, so it is necessary to combine as much information as possible to discover TF targets. The growing number of protein-DNA complex structures provides more information about the specific interactions between DNA bases and amino acids. These structures show that there is no simple one-to-one correspondence between bases and amino acids: the type and orientation of the interactions vary widely in space, and a given base-amino acid pair can adopt more than one geometric arrangement. This thesis investigates the effects of conserved residues on binding specificity from the perspective of energy. We use the program Rosetta to estimate the binding energy of protein-DNA complexes, taking the atomic coordinates of structures from the Protein Data Bank (PDB) as input. Rosetta can refine the three-dimensional coordinates of all atoms in a PDB file to optimize the energy. After calculating the energy of each complex before and after mutating bases or residues, we classify every base as causing either a remarkable or a non-remarkable energy change in the complex. Together with the distinction between conserved and non-conserved residues in the same complex, this gives two characteristics: conserved versus non-conserved residue, and remarkable versus non-remarkable base. Finally, we employ statistical analysis to evaluate the influence of conserved residues on binding specificity.
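One way to picture the final statistical step is a 2x2 contingency table that crosses the two characteristics (conserved or not, remarkable or not) and asks whether they are associated. The abstract does not name the test that was used, so Fisher's exact test here is an assumed choice, and the counts in the example are invented for illustration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Rows: conserved / non-conserved. Columns: remarkable / non-remarkable.
    Returns P(X >= a) under the hypergeometric null, i.e. the chance of
    seeing at least `a` conserved-and-remarkable positions if the two
    characteristics were independent.
    """
    row1 = a + b          # number of conserved positions
    col1 = a + c          # number of remarkable positions
    n = a + b + c + d     # all positions
    upper = min(row1, col1)
    favorable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favorable / comb(n, col1)


# Invented counts: 8 conserved+remarkable, 2 conserved+non-remarkable,
# 3 non-conserved+remarkable, 7 non-conserved+non-remarkable.
p_value = fisher_exact_greater(8, 2, 3, 7)
```

A small p-value would suggest that conserved positions are over-represented among those with large energy changes; with sample sizes in the hundreds, a chi-squared test would be a reasonable alternative.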

