Degree thesis

Protein Structure Comparison Based on Encoding and Fast Indexing of Sphere Model

Advisor: 歐陽彥正
Co-advisor: 黃乾綱

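Abstract


This thesis proposes a methodology for protein structure comparison (PSC) based on encoding and fast indexing of a sphere model. We first introduce our earlier work, EMPSC (Ellipsoidal Model for Protein Structure Comparison), a methodology for protein structure comparison based on ellipsoidal-model encoding. EMPSC is a two-level comparison method that starts from secondary structure, attempts to address the shortcomings of other methods, and provides local structure alignment. We then use the local alignment capability of EMPSC to try to detect local structure conservation. In our experiments, we observed that the matched local regions it finds vary in size and are highly scattered. We subsequently revisited the detection of local structure conservation using the NRS (Neighborhood Residues Sphere) concept, which gave good results and resolved the problem of inconsistent region size. Finally, to speed up comparison, we propose encoding the local structure surrounding each amino acid (within 10 Å) on the basis of the NRS sphere model, which in turn enables fast indexing for protein structure comparison. Our experiments indicate that our method, ESC (Environmental Signature Cluster), is capable of searching large protein structure databases and of local protein structure alignment, and could make fast structure mining across an entire protein database possible.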
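As an illustration of the NRS idea described above, the following minimal sketch collects, for each residue, the other residues that fall inside a 10 Å sphere, and then encodes that neighborhood as a small tuple. The 10 Å radius comes from the abstract; everything else is an assumption made for illustration: the function names `nrs_neighbors` and `shell_signature`, the use of one representative coordinate per residue, and the shell-count encoding itself, which is not the encoding used by ESC.

```python
import math

RADIUS = 10.0  # sphere radius in angstroms, as stated in the abstract


def nrs_neighbors(coords, radius=RADIUS):
    """For each residue, list the indices of the other residues within `radius`."""
    neighbors = []
    for i, a in enumerate(coords):
        close = [j for j, b in enumerate(coords)
                 if j != i and math.dist(a, b) <= radius]
        neighbors.append(close)
    return neighbors


def shell_signature(coords, i, neighbor_idx, shell_width=2.5, radius=RADIUS):
    """Hypothetical signature: neighbor counts per concentric distance shell.

    This is an assumed stand-in encoding, not the thesis's actual signature.
    """
    n_shells = int(radius / shell_width)
    counts = [0] * n_shells
    for j in neighbor_idx:
        d = math.dist(coords[i], coords[j])
        shell = min(int(d / shell_width), n_shells - 1)
        counts[shell] += 1
    return tuple(counts)


# Toy trace: six residues spaced 3.8 angstroms apart along a straight line.
coords = [(3.8 * k, 0.0, 0.0) for k in range(6)]
neigh = nrs_neighbors(coords)
sigs = [shell_signature(coords, i, n) for i, n in enumerate(neigh)]
```

Because every sphere has the same radius, each residue yields a signature of the same length, which is one way a fixed-size region can be made directly comparable across proteins. The quadratic neighbor search is kept for clarity; a spatial grid or k-d tree would be the usual choice for real structures.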

Parallel Abstract


This thesis proposes a new method for protein structure comparison (PSC) based on encoding and fast indexing of a sphere model. First, we build a fast PSC tool, EMPSC (Ellipsoidal Model for Protein Structure Comparison), which aims to address some drawbacks of other algorithms and to provide local alignment. Second, we apply the local alignment capability of EMPSC to detect structure conservation, and find that the local alignment regions it reports vary in size. Third, we therefore apply the NRS (Neighborhood Residues Sphere) concept to fix the size of the local alignment region at 10 Å, and again try to detect structure conservation through NRS sequence-structure clustering and comparison. Applications built on these algorithms are shown to work within the same EC class. Finally, drawing on the NRS experiments, we propose a further PSC method, ESC (Environmental Signature Cluster). It provides an indexing methodology for the three-dimensional geometry of protein local structure, which gives the method the capability for massive database search and local alignment finding. ESC makes fast mining of structure conservation across the whole PDB feasible.

EMPSC: a fast PSC tool based on an ellipsoidal model

First, we propose a new method, EMPSC, for the well-known PSC problem. EMPSC is a protein structural alignment algorithm based on an ellipsoidal model abstraction. We segment the protein 3D structure into two kinds of structure: Secondary Structure Elements (SSEs) recognized by DSSP [Kabsch 1983], and the remaining coil/loop structures. The SSEs serve as the initial alignment centers from which the transformation coordinate systems are obtained. Several heuristic filters and a geometric-hashing-based global alignment estimate are used to find good initial alignments quickly. In the refinement stage, a standard refinement algorithm is invoked to fine-tune the alignment produced by the first stage. Our experimental results show that EMPSC generally achieves comparable accuracy and better performance than existing PSC algorithms. We also analyze the factors that affect the performance of EMPSC and of SSE-based PSC algorithms. Multiple protein structure comparison and local structure comparison are left for further investigation.

ESC: a faster PSC tool based on a sphere model

Second, our proposed Environmental Signature Cluster (ESC) method uses the environmental signature of each residue, based on the Neighborhood Residues Sphere (NRS) concept, to index the three-dimensional geometry of protein local structure. With NRS local geometry indexing, we digitize a protein structure into a set of NRS environmental signatures, which allows the method to handle massive database search and local alignment finding, including one-against-all protein comparisons. ESC can already report the degree of similarity among proteins quickly. At present, ESC is best suited to constrained local structure alignment, and applying it to one-against-all comparison over the PDB (Protein Data Bank) is workable. In a test with 50 randomly selected protein chains, a one-against-all search of the whole PDB produced alignment results in about 15 minutes on average. The experimental results show that the proposed method is capable of massive database search and is well suited to local structure identification and the discovery of local structure conservation.
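The indexing step described above can be pictured as an inverted index from signatures to the residues that carry them, queried by voting. The sketch below is only an assumption about how such an index might look, not the ESC implementation: the names `build_index` and `search`, the protein identifiers, and the four-number signatures are all hypothetical, and exact-match lookup stands in for whatever clustering of signatures the thesis uses.

```python
from collections import Counter, defaultdict


def build_index(database):
    """Map each signature to the (protein_id, residue_index) pairs that carry it."""
    index = defaultdict(list)
    for protein_id, signatures in database.items():
        for pos, sig in enumerate(signatures):
            index[sig].append((protein_id, pos))
    return index


def search(index, query_signatures):
    """Rank database proteins by how many query residues share a signature with them."""
    votes = Counter()
    for sig in query_signatures:
        # Count each protein at most once per query residue.
        for protein_id in {pid for pid, _ in index.get(sig, [])}:
            votes[protein_id] += 1
    return votes.most_common()


# Hypothetical database: protein id -> one signature per residue.
database = {
    "protA": [(0, 1, 0, 1), (0, 2, 0, 1), (0, 2, 0, 2)],
    "protB": [(1, 1, 0, 0), (0, 2, 0, 2)],
    "protC": [(2, 0, 1, 0)],
}
index = build_index(database)
hits = search(index, [(0, 1, 0, 1), (0, 2, 0, 2), (3, 3, 3, 3)])
```

In this toy run, `hits` ranks `protA` first with two matching query residues and `protB` second with one, while `protC` receives no votes. A lookup costs one dictionary access per query residue regardless of database size, which is the property that makes one-against-all search plausible at the scale of a whole structure database.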

