
A Semi-Assembly Approach for Genome Reconstruction Using Closely-Related Reference Sequences

Advisor: 黃耀廷

Abstract


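In recent years, because the genomes of many species have been sequenced and assembled, a newly sequenced species is often closely related to one whose genome is already assembled. However, genomes usually contain complex repeats, so existing de novo assembly methods tend to produce fragmented genomes. In this thesis, we design a novel assembly method (called SemiAssembler) that uses closely-related genome sequences together with conventional de novo assembly to complete the genome assembly of the target species. First, we modify the genome sequence of the related species by inserting or deleting species-specific sequences, which creates a draft genome. Next, the contigs assembled from short reads are mapped back to the draft genome, and the draft genome sequence is replaced, one mapped contig at a time, by the contig sequences. This step reflects inter-species single nucleotide polymorphisms (SNPs) and small insertions and deletions (indels). Simulation experiments indicate that our method achieves high precision and recall. We used the program to assemble two different rice cultivars. Polymerase chain reaction (PCR) experiments verified that some of the insertions and deletions of various sizes found by our method are indeed differences between the genomes of the two cultivars.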

Parallel Abstract


In recent years, as many genomes have been sequenced and assembled, newly sequenced genomes are often closely related to an existing genome. However, owing to complex repeat structures in the genome, the genomes assembled by existing methods are often highly fragmented. In this thesis, we design a semi-assembly approach (called SemiAssembler) that integrates reference mapping and de novo assembly to reconstruct a newly sequenced genome using closely-related genome sequences. A draft genome is first created by adding inter-species insertions to, and removing inter-species deletions from, the related genome. Subsequently, the draft genome sequence is replaced with the contig sequences assembled from short reads, which aims to reflect inter-species SNPs and small indels. Simulation results indicate that our method achieves high precision and recall. The program was used to assemble two O. sativa genomes. A substantial number of the large insertions/deletions and small indels found by our method were validated by PCR.
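The two steps described in the abstract (building a draft genome, then replacing draft regions with contigs) can be illustrated with a toy sketch. This is not the SemiAssembler implementation. The names `StructuralEdit`, `ContigHit`, `build_draft`, and `replace_with_contigs` are hypothetical, and the sketch assumes that the large insertions/deletions and the contig alignments have already been computed by other tools, that they do not overlap, and that coordinates are 0-based and half-open.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StructuralEdit:
    """Hypothetical record: a large insertion/deletion in reference coordinates."""
    pos: int          # 0-based position in the related genome
    delete_len: int   # reference bases to remove (0 for a pure insertion)
    insert_seq: str   # bases to add at pos ("" for a pure deletion)


@dataclass(frozen=True)
class ContigHit:
    """Hypothetical record: a contig aligned to the draft interval [start, end)."""
    start: int
    end: int
    contig_seq: str


def build_draft(reference: str, edits: list[StructuralEdit]) -> str:
    """Step 1: apply non-overlapping large indels to the related genome."""
    draft = reference
    # Apply edits right to left so positions further left remain valid.
    for e in sorted(edits, key=lambda e: e.pos, reverse=True):
        draft = draft[:e.pos] + e.insert_seq + draft[e.pos + e.delete_len:]
    return draft


def replace_with_contigs(draft: str, hits: list[ContigHit]) -> str:
    """Step 2: overwrite each aligned draft interval with its contig sequence."""
    out = draft
    # Again right to left, because a contig may differ in length (small indels).
    for h in sorted(hits, key=lambda h: h.start, reverse=True):
        out = out[:h.start] + h.contig_seq + out[h.end:]
    return out


reference = "ACGTACGTACGT"
edits = [
    StructuralEdit(pos=4, delete_len=0, insert_seq="TTT"),  # insertion
    StructuralEdit(pos=8, delete_len=2, insert_seq=""),     # deletion
]
draft = build_draft(reference, edits)

hits = [
    ContigHit(start=0, end=4, contig_seq="ACCT"),    # carries one SNP
    ContigHit(start=7, end=11, contig_seq="ACGGT"),  # carries a 1-base insertion
]
assembled = replace_with_contigs(draft, hits)
print(draft)
print(assembled)
```

Processing the edits from right to left is a design choice of this sketch: it avoids having to shift coordinates after every change in sequence length. A real pipeline would also need to handle overlapping alignments, unmapped contigs, and reverse-strand hits, none of which are covered here.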

