
Whole Genome Phylogeny of Prokaryotes by Unique Sequence Profiles: A Rapid Alignment-free Approach

Advisor: 王世融

Abstract

Traditional bacterial phylogenies are mostly built from sequence variation in a few conserved genes. Among evolutionarily close species or strains, however, these genes often show little or no variation, so they cannot reliably resolve close relatives. As the number of completed bacterial genomes has grown, attention has turned to whole-genome phylogeny, and many computational tools have been developed for it. These tools are usually classified as alignment-based or alignment-free, depending on whether they perform sequence alignment. Because existing methods compute exhaustively over large volumes of genome sequence data, they tend to be time-consuming and to depend on extensive computing resources, which become expensive when many strains are processed at once.

This study develops a rapid alignment-free whole-genome phylogeny method for prokaryotes based on unique sequence profiles. The method uses only a small number of representative unique sequences from each genome and skips the step, common in existing tools, of counting the frequencies of large numbers of non-representative sequences. It can therefore handle the rapidly growing number of bacterial genomes more efficiently. The method has low computing requirements, and all of its programs can be installed on a desktop PC running Linux.
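As a rough illustration of the alignment-free idea described in the abstract, the sketch below builds a profile of unique k-mers for each genome and compares profiles pairwise. The abstract does not specify how representative sequences are selected or how distances are computed, so everything here is an assumption made for illustration: "unique" is taken to mean a k-mer that occurs exactly once within a genome, the distance is a plain Jaccard distance, and the function names `unique_kmers`, `jaccard_distance`, and `distance_matrix` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def unique_kmers(seq, k):
    """Return the set of k-mers that occur exactly once in seq.

    Assumption: 'unique' means single-copy within one genome; the
    thesis may define and select its unique sequences differently.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer for kmer, n in counts.items() if n == 1}


def jaccard_distance(a, b):
    """Jaccard distance between two k-mer sets (0 = identical, 1 = disjoint)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(genomes, k):
    """Compute pairwise profile distances without any sequence alignment.

    genomes maps a genome name to its sequence string.
    Returns the sorted names and a dict keyed by (name, name) pairs.
    """
    profiles = {name: unique_kmers(seq, k) for name, seq in genomes.items()}
    names = sorted(profiles)
    dist = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        d = jaccard_distance(profiles[a], profiles[b])
        dist[(a, b)] = dist[(b, a)] = d
    return names, dist


if __name__ == "__main__":
    # Toy sequences, not real genomes.
    toy = {
        "A": "ACGTACGGTTCA",
        "B": "ACGTACGGTTCC",
        "C": "TTTTGGGGCCCC",
    }
    names, dist = distance_matrix(toy, k=4)
    for a, b in combinations(names, 2):
        print(a, b, round(dist[(a, b)], 3))
```

A distance matrix of this kind could then be passed to a standard tree-building method such as neighbor joining or UPGMA. Whether the thesis uses a comparable distance or tree-construction step is not stated in the abstract.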

Keywords

phylogeny tree; unique K-mer; whole genome

