Design and Implementation of an Expressed Sequence Tag Alignment Tool

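This thesis designs software that aligns the sequences in an expressed sequence tag (EST) database to the genome they belong to. The main approach is to implement a multi-layer unique marker table and to carry out distributed computation based on the grid concept. The goal is to let a personal computer analyze large numbers of ESTs and produce important related information, so that scholars and researchers in bioinformatics and related fields can obtain the information they need at lower cost. In recent years, molecular biology laboratories around the world have produced large amounts of DNA sequence data, far more than can be processed by hand. Since the BLAST method was proposed in 1990, almost all bioinformatics problems have been handled with this software; speeding it up is still necessary, however, so many methods for improving it have been proposed. The key point is that biologists want new software to improve not only the speed of solving problems but also the accuracy. This thesis works to meet that requirement, so that biologists can infer gene sequences and gene structures correctly and quickly, and so that this software can become a public foundation for analyzing EST databases.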

Keywords

expressed sequence tag; sequence alignment; distributed computing

Parallel Abstract

This thesis designs software that aligns the sequences in an Expressed Sequence Tag (EST) database to the genome they belong to. The main approach is to implement a Multi-Layer Unique Markers table and to use the grid concept to achieve distributed computing. The purpose is to analyze large numbers of ESTs on a personal computer and to produce important, relevant information, so that biologists and researchers in bioinformatics can acquire the information they need at lower cost. In recent years, molecular biology laboratories around the world have generated a tremendous amount of DNA sequence data. Since BLAST was published in 1990, almost all bioinformatics problems have relied on it. However, the speed at which it processes this data still needs to improve, and many ideas have been proposed to that end. What biologists want from new software is not only faster problem solving but also better accuracy. This thesis aims to meet that demand. The software will allow biologists to infer gene sequences and gene structures more quickly and more correctly, and it is intended to become a public, open basis for EST database analysis.

Parallel Keywords

expressed sequence tag; sequence alignment; distributed computing

References

[1] Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. (1990) Basic local alignment search tool. J. Mol. Biol. 215: 403-410.

[2] Chao, K. M., Zhang, J., Ostell, J. and Miller, W. (1995) A local alignment tool for very long DNA sequences. Comput. Appl. Biosci. 11: 147-153.

[3] Gelfand, M. S., Mironov, A. A. and Pevzner, P. A. (1996) Spliced alignment: A new approach to gene recognition. Proc. Natl. Acad. Sci. 93: 9061-9066.

[4] Sze, S. H. and Pevzner, P. A. (1997) Las Vegas algorithms for gene recognition: Suboptimal and error-tolerant spliced alignment. J. Comput. Biol. 4: 297-309.

[6] Chao, K. M., Zhang, J., Ostell, J. and Miller, W. (1997) A tool for aligning very similar DNA sequences. Comput. Appl. Biosci. 13: 75-80.
