
蛋白質內含子之預測

The Prediction Of Intein Sequence

Advisor: 陳中明

Abstract


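An intein is a parasitic genetic element, analogous to an intron in DNA. It is transcribed and translated together with its host gene; once translation has produced the protein sequence, the intein spontaneously excises itself from the host protein (self-splicing) without affecting the host protein's original function. It is named "intein" (protein intron) because of this early parasitic stage and its self-splicing behavior after the protein is formed. More than 300 inteins have been collected and published in the intein database (Perler, 2002, InBase). The data show that inteins occur throughout the three-domain system that comprises all living organisms (bacteria, archaea, and eukaryotes); whichever domain is examined, inteins are present. Seventy percent of inteins reside in DNA or RNA polymerase genes, and another twenty-five percent reside in metabolism-related genes. Because of their self-splicing biochemistry, inteins are widely used in protein engineering, with many applications in protein synthesis, drug development, and gene therapy.

Beyond this distinctive biochemistry, the evolutionary history of inteins remains an unresolved point of debate between Liu (2000) and Pietrokovski (2001). To help discover more inteins and to aid the study of their evolutionary tree, we designed a tool for this purpose, which is the main motivation for this thesis.

In terms of functional regions, an intein can be divided into two major parts: the splicing domain and the DOD endonuclease domain. The DOD endonuclease domain recognizes, approximately rather than exactly, a segment of about 14 to 40 bases in the host DNA sequence and then cleaves it, triggering the homing process and spreading the intein sequence. Functionally, therefore, the DOD endonuclease domain is not an essential part of an intein, and roughly ten percent of inteins have only the splicing domain; these are called mini-inteins. Because the DOD endonuclease domain is dispensable, and because it is highly variable owing to the need to recognize different host genes, we do not use it in this study. The experiments mainly use sequences from the A, B, F, and G motifs within the splicing domain. Applying a support vector machine (SVM) to these motifs successfully distinguishes intein sequences.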

Parallel Abstract


An intein is a parasitic genetic element similar to a self-splicing intron. Inteins are able to splice themselves out of their host protein and rejoin the remaining protein segments, called exteins, without affecting the function of the host protein. The self-splicing of an intein is a spontaneous reaction. To date, more than 300 distinct inteins have been discovered and archived in the public intein database (InBase). Inteins are found in organisms across all three domains of life, i.e., archaea, eubacteria, and eucarya. Currently, approximately 70% of discovered inteins reside in host genes encoding DNA/RNA polymerases, while another 25% or more are found in metabolic genes. Inteins are versatile in biotechnological applications such as protein synthesis, drug development, and gene therapy, which may be ascribed to their efficiency in protein splicing. Important as inteins are, their evolutionary process remains controversial, and their classification has rarely been investigated. To assist the understanding and discovery of intein sequences, we propose a tool that predicts intein sequences based on known inteins. The tool can distinguish inteins from other proteins and hence may help identify inteins within a host protein. With this tool, we may be able to find sequence features of inteins that improve the understanding of their evolutionary process. An intein can be functionally divided into a splicing domain and a DOD endonuclease domain, as shown in Figure 2. The DOD endonuclease domain recognizes sites of 14–40 DNA bases and usually does not require a complete match with the target sequence to initiate the homing process that spreads the intein. Accordingly, the DOD domains of inteins may vary with different target genes. Nevertheless, the DOD domain is not necessary for an intein; an intein without a DOD domain is regarded as a mini-intein.
The splicing domain, on the other hand, exists in all inteins and plays a crucial role in protein splicing. We therefore adopt the A, B, F, and G motifs in the splicing domain as features to characterize an intein for classification. In this study, we use a support vector machine (SVM) to distinguish inteins from other proteins.
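
The abstract does not say how the motifs are encoded, which kernel is used, or which SVM implementation the authors relied on. As a rough illustration only, the sketch below assumes each of the four motif segments is reduced to a 20-value amino-acid composition vector and that a plain linear SVM without a bias term is trained by sub-gradient descent on the hinge loss. All sequences are made-up toy strings (the two classes deliberately use disjoint residue alphabets so the example is trivially separable), and every function name and hyperparameter is hypothetical.

```python
# Toy sketch only: the sequences, feature encoding, and hyperparameters are
# illustrative assumptions, not taken from the thesis.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(segment):
    """Frequency of each amino acid in one motif segment (20 values)."""
    n = max(len(segment), 1)
    return [segment.count(aa) / n for aa in AMINO_ACIDS]


def featurize(motifs):
    """Concatenate the composition vectors of the A, B, F, and G segments."""
    features = []
    for segment in motifs:
        features.extend(composition(segment))
    return features


def score(w, x):
    """Signed decision value w . x (no bias term in this sketch)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=50):
    """Sub-gradient descent on hinge loss + (lam/2)*||w||^2.

    Labels must be +1 (intein) or -1 (non-intein).
    """
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * score(w, x)
            # Regularization step: shrink the weights slightly.
            w = [(1.0 - eta * lam) * wi for wi in w]
            # Hinge-loss step: only examples inside the margin push on w.
            if margin < 1.0:
                w = [wi + eta * label * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 (intein-like) or -1 (other protein)."""
    return 1 if score(w, x) > 0 else -1


# Made-up (A, B, F, G) motif segments; NOT real intein sequences.
toy_inteins = [
    ("CHNSTC", "THNC", "NSTH", "HNCS"),
    ("CSNTHH", "NNTC", "STCH", "SHNT"),
    ("HHCNST", "TCSN", "CNHS", "NTSC"),
]
toy_others = [
    ("AKLEGA", "LEGK", "GAKL", "KLEA"),
    ("GELKAA", "KKLE", "ALGE", "EGKL"),
    ("LLAKGE", "EGAK", "KAEL", "GLKA"),
]

X = [featurize(m) for m in toy_inteins + toy_others]
y = [1] * len(toy_inteins) + [-1] * len(toy_others)
w = train_linear_svm(X, y)

# Classify two unseen toy examples, one from each made-up class.
new_intein_like = featurize(("TSNHCC", "CHTN", "HSNT", "SCNH"))
new_other_like = featurize(("EAGLKK", "AGLE", "LKGA", "AEKG"))
print(predict(w, new_intein_like), predict(w, new_other_like))
```

On real data one would more likely use an established SVM library with a tuned kernel and cross-validation; the hand-written trainer here only keeps the example dependency-free.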

References


Perler, F.B.: InBase, the Intein Database. Nucleic Acids Res. 30, 383–384 (2002)
Pietrokovski, S.: Intein spread and extinction in evolution. Trends Genet. 17, 465–472 (2001)
Liu, X.Q.: Protein-splicing intein: genetic mobility, origin, and evolution. Annu. Rev. Genet. 34, 61–76 (2000)
Schwarzer, D., Cole, P.A.: Protein semisynthesis and expressed protein ligation: chasing a protein's tail. Curr. Opin. Chem. Biol. 9(6), 561–569 (2005)
De Grey, A.D.N.J.: Mitochondrial gene therapy: an arena for the biomedical use of inteins. Trends Biotechnol. 18(9), 394–399 (2000)
