A Study on Fast Analysis of Protein Three-Dimensional Substructures

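Analyzing the interactions between proteins and ligands is an important issue in drug design. Obtaining detailed and accurate results requires calculating the free energy between atoms, which involves thermodynamics and even quantum mechanics; however, the time complexity of these calculations is very high. In computer-aided drug design, therefore, the three-dimensional structures of proteins and ligands are commonly analyzed as a filtering step to speed up the analysis. A notable observation in this regard is that the binding of a protein to a ligand usually depends on a small portion of the substructures on the protein surface, while most of the protein has no decisive influence on the binding process. Consequently, quickly identifying the amino acids located on the surface or in the cavities of a protein's tertiary structure would help accelerate the whole analysis. In this thesis, we propose a filtering algorithm with time complexity $O(n \log n)$, where $n$ is the number of amino acids in the protein. The proposed algorithm builds on a kernel density estimation algorithm recently developed in our laboratory. Compared with the $\alpha$-hull algorithm, which is commonly used in computer graphics to find the surface of a three-dimensional model, it reduces the time complexity from $O(n^2)$ to $O(n \log n)$, markedly improving the performance of the filtering program. Experimental results show that filtering with the proposed algorithm speeds up the overall analysis by a factor of 24.91 to 83.53 with no loss of accuracy. We also implemented the proposed algorithm as a software tool that can search the PDB (Protein Data Bank) for proteins with similar substructures; its results can give biologists useful clues for further research.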

Keywords

protein structure; kernel density estimation

Parallel Abstract

One of the fundamental issues in drug design is the analysis of protein-ligand interactions. Detailed and accurate analysis of these interactions involves calculating binding free energy based on thermodynamics and even quantum mechanics. However, this approach is very expensive in terms of computing time. As a result, conformational and structural analysis of proteins and ligands has been widely employed as a screening process in computer-aided drug design. One interesting observation in this regard is that, for many applications, only the substructures on the contour of a protein are of significance. Therefore, in order to expedite the analysis, it is desirable to incorporate a mechanism that can effectively extract the residues in the proximity of the cavities of protein tertiary structures. In this thesis, an efficient filtering process with $O(n \log n)$ time complexity is proposed, where $n$ is the number of residues in the protein. In comparison with the $\alpha$-hull algorithm, which is widely used in computer graphics for identifying the instances on the contour of a 3-dimensional object, the filtering process employed in this thesis features a lower time complexity, $O(n \log n)$ versus $O(n^2)$. The low time complexity of the proposed filtering process is due to a novel kernel density estimation algorithm. Experimental results show that the proposed filtering mechanism speeds up the analysis by a factor of 24.91 to 83.53 without sacrificing accuracy. The software package developed with the proposed mechanism has been applied to search the PDB (Protein Data Bank) for proteins containing a binding site similar to that of a well-studied crystal structure. The experimental results provide biochemists with valuable clues for in-depth studies.
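The abstract does not describe the kernel density estimation algorithm itself, so the following is only a minimal sketch of the general idea, not the method of this thesis. It assumes that residues are represented by 3-D coordinates (for example, one point per residue), that a truncated Gaussian kernel is an acceptable stand-in, and that residues with low local density are plausible surface candidates. The names `kernel_density`, `surface_candidates`, `bandwidth`, `cutoff`, and `keep_fraction` are hypothetical. A uniform grid keeps neighbor lookups local, and the final ranking is a sort, so the cost is about $O(n \log n)$ when the number of points per grid cell is bounded.

```python
import math
from collections import defaultdict


def kernel_density(points, bandwidth=1.0, cutoff=3.0):
    """Truncated Gaussian kernel density at every point (illustrative only).

    Contributions from points farther than cutoff * bandwidth are ignored,
    so each query only inspects the 27 grid cells around it.
    """
    radius = cutoff * bandwidth
    radius_sq = radius * radius

    def cell_of(p):
        # Grid cell index of a point; cell edge length equals the cutoff radius.
        return tuple(math.floor(c / radius) for c in p)

    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[cell_of(p)].append(idx)

    densities = []
    for p in points:
        cx, cy, cz = cell_of(p)
        total = 0.0  # includes the point's own contribution of 1.0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                        q = points[j]
                        d_sq = sum((a - b) ** 2 for a, b in zip(p, q))
                        if d_sq <= radius_sq:
                            total += math.exp(-d_sq / (2.0 * bandwidth ** 2))
        densities.append(total)
    return densities


def surface_candidates(points, keep_fraction=0.5, bandwidth=1.0, cutoff=3.0):
    """Indices of the lowest-density points, assumed here to lie near the surface."""
    densities = kernel_density(points, bandwidth, cutoff)
    order = sorted(range(len(points)), key=lambda i: densities[i])
    keep = max(1, int(len(points) * keep_fraction))
    return sorted(order[:keep])
```

On a solid block of evenly spaced points, the corners receive the lowest density and the center the highest, so the corners are kept as candidates and the center is filtered out. Real residue coordinates would need a bandwidth chosen for typical inter-residue distances, which this sketch does not attempt to tune.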

Parallel Keywords

protein structure; kernel density estimation

