
Parallel Architecture and Hardware Implementation of Biological Sequence Alignment Processors

Advisor: 盧奕璋

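Abstract


This thesis proposes a parallel processor hardware architecture for biological sequence alignment. It compares subject sequences against a query sequence and outputs results ranked by their similarity. The architecture is implemented as a hardware accelerator, with the aim of mitigating the growth in alignment time caused by the sharp increase in the number of biological sequences in recent years.

Besides gaining speed from parallel processing, the proposed processor uses a hierarchical design that yields a pipelining effect, which shortens the processing time further. In terms of area, parallel processing would in theory increase the area linearly, but with the hierarchical architecture the lower-level circuits can be shared by the higher-level circuits, so the total circuit area does not grow linearly with the number of parallel processors. The design therefore keeps the processing-time advantage of parallelism without fully incurring the area penalty of conventional parallel processing. In addition, this thesis proposes a scalable architecture for the processor that links multiple processors to handle longer subject sequences, giving a more complete hardware solution for biological sequence alignment.

A parallel processor for protein sequence alignment is implemented using the TSMC 90 nm cell library. It can process up to 1,048,576 subject sequences, with each subject or query sequence up to 8,192 amino acids long. The processor reaches an operating frequency of 100 MHz and a circuit area of 32 mm². A similar architecture can also be applied to nucleotide sequence alignment.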

English Abstract


This thesis proposes a parallel architecture for biological sequence alignment that compares a query sequence against a set of subject sequences. It takes the sequences as input and produces a list of subject sequences sorted by their similarity to the query. Based on this architecture, we implement a hardware accelerator to curb the fast-growing execution time of sequence alignment caused by the many new sequences discovered in recent years. The proposed processor combines a parallel architecture with a hierarchical architecture, which adds a pipelining effect on top of the parallel speedup. Although chip area is in theory linearly proportional to the number of parallel channels, block reuse alleviates this cost while parallelism still reduces the processing time: low-level circuits are shared by the high-level circuits. In addition, this thesis proposes a scalable architecture that concatenates more than one processor to extend the allowable sequence length, which makes it suitable for processing longer subject sequences in the future. It thus provides a feasible hardware solution for biological sequence alignment. A parallel alignment processor for protein sequences is implemented using the TSMC 90 nm cell library. It can process 1,048,576 subject sequences, and query and subject sequences can each be as long as 8,192 amino acid residues. The processor operates at 100 MHz and its chip area is 30 mm².
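
The compare-and-rank behavior described in the abstract can be illustrated in software. The sketch below is only an illustration under stated assumptions: the abstract does not name the alignment algorithm or the scoring scheme, so Smith-Waterman local alignment with a linear gap penalty and the `match`, `mismatch`, and `gap` values are assumed here, and the function names `local_alignment_score` and `rank_subjects` are hypothetical. It says nothing about how the hardware is organized.

```python
def local_alignment_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    The algorithm and the scoring values are assumptions for illustration;
    the abstract does not specify them.
    """
    prev = [0] * (len(subject) + 1)  # previous row of the score matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            up = prev[j] + gap        # gap in the subject
            left = curr[j - 1] + gap  # gap in the query
            score = max(0, diag, up, left)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def rank_subjects(query, subjects):
    """Return (score, subject) pairs sorted from most to least similar."""
    scored = [(local_alignment_score(query, s), s) for s in subjects]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    for score, subject in rank_subjects("ACDE", ["WWWW", "ACDE", "ACWW"]):
        print(score, subject)
```

Each subject sequence is scored independently of the others, which is why a workload of this kind lends itself to parallel channels; a production scorer for proteins would normally use a substitution matrix rather than a fixed match/mismatch value.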

