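The genus Caenorhabditis is best known for the model organism Caenorhabditis elegans, which is widely used in biological research. Compared with the wealth of genetic and molecular biology knowledge available for C. elegans, our understanding of its evolution is relatively limited. The main reason is the lack of a precise evolutionary framework derived from comparative studies with many closely related species. In this thesis, given the availability of genomes from 11 Caenorhabditis species, comparative genomics offers an effective way to study this topic. The thesis is divided into five chapters: the first and last are the background and the conclusion, and Chapters 2 to 4 present the research topics. In Chapter 2, I first carry out a chromosome-scale genome comparison between C. elegans and its sister species, C. inopinata. Because sister species are closely related and both of these species have completely assembled genomes, a high-resolution genome comparison can explore what makes each species unique by revealing subtle differences in their gene families and synteny. In Chapter 3, I use the complete C. elegans genome to group the scaffolds of other Caenorhabditis genomes that have not been assembled to chromosome scale, so that these nematodes can also be studied at chromosome scale. The scaffolds that pass my filtering criteria fall into exactly six groups, representing the six chromosomes shared by Caenorhabditis genomes. However, a considerable number of scaffolds fail the criteria, and further analysis shows that in most of these cases the genome assembly is either too fragmented and incomplete or over-assembled. In Chapter 4, I therefore use synteny analysis, an important downstream analysis in comparative genomics, to examine systematically how assembly contiguity affects downstream comparative analyses. Finally, for the robustness and accuracy of comparative genomics analyses, I propose a basic requirement that the assembly contiguity statistic N50 be at least 1 Mb, a requirement that is further adjusted according to the gene density of the species.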
The genus Caenorhabditis is known mainly for the model species Caenorhabditis elegans, which is widely used in biological research. In contrast to the abundant genetic and molecular biology knowledge accumulated for this model species, the evolutionary and ecological context of C. elegans remains relatively unexplored. This gap is due to the lack of an explicit evolutionary framework built from comparisons with closely related species. In this thesis, with the genomes of 11 Caenorhabditis species available, comparative genomics provides a useful way to investigate this topic. The thesis is divided into five chapters. The first and last chapters are Background and Conclusions, respectively. Chapters 2 to 4 are standalone topics that are related to each other. In Chapter 2, I carried out comparative genomics analyses between C. elegans and its sister species C. inopinata. Because the two species are closely related and both have high-quality genome assemblies, genome-wide comparisons of features such as gene families and synteny can be made at high resolution and partitioned by chromosome, allowing a deeper evolutionary interpretation of what makes each species unique. In Chapter 3, the comparisons were carried out at a larger scale, across several branches of the genus, using the 11 available Caenorhabditis species genomes. I show that selected scaffolds of each species can be assigned to six linkage groups representing the six chromosomes. Inspecting the exceptions revealed a striking case of over-assembly as well as the problem of incompletely assembled genomes. In Chapter 4, to investigate the interplay between assembly contiguity and downstream analysis, I evaluated synteny in assemblies of differing contiguity from model nematodes in Caenorhabditis and Strongyloides. I demonstrate that a minimum N50, which depends on the gene density of the species, is required for robust downstream studies such as synteny analysis in comparative genomics.
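For readers unfamiliar with the N50 statistic mentioned above, the following is a minimal sketch of how it is conventionally computed from a list of scaffold lengths: sort the lengths from longest to shortest and report the length at which the running total first reaches half of the total assembly size. The function names `n50` and `meets_minimum_n50` are hypothetical and chosen only for illustration. The 1 Mb default threshold follows the figure proposed in this abstract, and since that figure is said to be adjusted by gene density, it is left as a parameter. This is not the code used in the thesis.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    sorted_lengths = sorted(lengths, reverse=True)
    total = sum(sorted_lengths)
    if total <= 0:
        raise ValueError("at least one scaffold with positive length is required")

    running = 0
    for length in sorted_lengths:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


def meets_minimum_n50(lengths: Iterable[int], threshold: int = 1_000_000) -> bool:
    """Hypothetical helper: check an assembly against a minimum N50.

    The 1 Mb default is an assumption taken from the abstract; the
    appropriate value depends on the gene density of the species.
    """
    return n50(lengths) >= threshold


if __name__ == "__main__":
    # Made-up scaffold lengths in base pairs, for illustration only.
    example = [900_000, 700_000, 400_000, 300_000, 200_000]
    print(n50(example), meets_minimum_n50(example))
```

Using integer arithmetic (`2 * running >= total`) keeps the comparison exact when the total assembly length is odd.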