
以去氧核糖核酸作用之專一性及非專一性結合殘基預測結果為基礎進而推論蛋白質序列上蛋白質-核酸結合類型

Prediction of Transcription Factor Domain based on Analysis of Specific and non-Specific DNA-Binding Residues on the Protein Sequence

Advisor: 黃乾綱

Abstract


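Protein-DNA interactions are typically involved in important biochemical processes such as DNA transcription, replication, transmission of genetic information, and genetic recombination. Protein-DNA binding can be divided into sequence-specific and non-specific binding. Sequence-specific binding recognizes particular DNA base pairs, whereas non-specific binding mainly involves interaction with the sugar-phosphate backbone of DNA.

The first stage of this thesis addresses binding-residue prediction. For sequence-specific binding residues, the classifier achieves 96.45% accuracy, 50.14% sensitivity, 99.31% specificity, 81.70% precision, and a 62.15% F-measure. The non-specific binding predictor achieves 89.14% accuracy, 53.06% sensitivity, 95.25% specificity, 65.47% precision, and a 58.62% F-measure. Combining the two sets of predictions with an OR operation yields 89.26% accuracy, 56.86% sensitivity, 95.63% specificity, 71.92% precision, and a 63.51% F-measure. The second stage addresses prediction of the protein-DNA binding mode; the multi-class support vector machine designed for this task achieves 75.83% accuracy.

This thesis presents a sequence-based classifier that predicts sequence-specific and non-specific binding residues in transcription factors, taking the DNA-binding mechanism into account. The binding-mode predictor is intended to give biochemists additional structural information and to further improve residue prediction performance. We also intend to apply the experience gained in this work to binding-residue prediction for protein types other than transcription factors.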
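The abstract does not define its F-measure. Assuming it is the usual harmonic mean of precision and sensitivity (recall), the reported figures can be cross-checked with a few lines of Python. The function name `f_measure` is illustrative and not taken from the thesis.

```python
def f_measure(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall).

    Assumption: this is the standard F1 definition; the abstract does not
    state which F-measure variant was used.
    """
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# (precision %, sensitivity %, reported F-measure %) as given in the abstract
reported = {
    "sequence-specific": (81.70, 50.14, 62.15),
    "non-specific": (65.47, 53.06, 58.62),
    "OR-combined": (71.92, 56.86, 63.51),
}

for name, (precision, sensitivity, f_reported) in reported.items():
    computed = f_measure(precision, sensitivity)
    print(f"{name}: computed {computed:.2f}, reported {f_reported:.2f}")
```

Under that assumption, each computed value agrees with the reported one to within about 0.01 percentage points, which is what rounding of the published inputs would explain.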

Parallel Abstract


Protein-DNA interactions are essential for fundamental biochemical activities, including DNA transcription, replication, packaging, repair, and rearrangement. Proteins interact with DNA in two modes: sequence-specific binding and non-specific binding. Specific binding provides a mechanism for recognizing the correct nucleotide base pairs, that is, sequence-specific identification. Non-specific binding, by contrast, shows relatively little base-sequence preference and involves interaction with the DNA backbone.

In this thesis, we present a two-stage approach to protein-DNA binding prediction. In the first stage, DNA-binding residue prediction, the predictor for sequence-specific binding residues achieves 96.45% accuracy, 50.14% sensitivity, 99.31% specificity, 81.70% precision, and a 62.15% F-measure. The predictor for non-specific binding residues achieves 89.14% accuracy, 53.06% sensitivity, 95.25% specificity, 65.47% precision, and a 58.62% F-measure. When the sequence-specific and non-specific predictions are combined with an OR operation, the resulting predictor achieves 89.26% accuracy, 56.86% sensitivity, 95.63% specificity, 71.92% precision, and a 63.51% F-measure. In the second stage, we propose a protein-DNA interaction mode predictor, a multi-class support vector machine that achieves 75.83% accuracy.

This work presents the design of a sequence-based predictor that identifies sequence-specific and non-specific DNA-binding residues in transcription factors while taking the DNA-binding mechanism into account. The interaction mode prediction is introduced to give biochemists additional structural hints and to help improve the DNA-binding residue prediction of the first stage. We will also apply the experience gained in this study to the design of binding-mechanism-aware predictors for other types of DNA-contacting proteins.
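The OR combination described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes each predictor emits one binary label per residue, that a residue counts as positive in the combined evaluation if it binds DNA in either mode, and that the metrics follow their standard confusion-matrix definitions. The function names and the toy labels are hypothetical.

```python
from typing import Dict, List


def or_combine(specific: List[int], non_specific: List[int]) -> List[int]:
    """Label a residue as binding if either predictor labels it as binding."""
    if len(specific) != len(non_specific):
        raise ValueError("predictions must cover the same residues")
    return [int(a or b) for a, b in zip(specific, non_specific)]


def binary_metrics(truth: List[int], pred: List[int]) -> Dict[str, float]:
    """Standard confusion-matrix metrics for per-residue binary labels."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)
    f_denominator = precision + sensitivity
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f_measure": 2 * precision * sensitivity / f_denominator if f_denominator else 0.0,
    }


# Hypothetical toy labels for an 8-residue sequence (1 = DNA-binding residue).
truth = [1, 1, 0, 0, 1, 0, 0, 1]
specific_pred = [1, 0, 0, 0, 0, 0, 0, 0]
non_specific_pred = [0, 1, 0, 1, 0, 0, 0, 0]

combined = or_combine(specific_pred, non_specific_pred)
metrics = binary_metrics(truth, combined)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A union of two predictors can only add positive labels, so sensitivity cannot decrease relative to either predictor alone, while false positives from both accumulate. This matches the direction of the reported figures, where the combined sensitivity (56.86%) exceeds that of either single predictor.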

