新的定量化合物結構與生物活性關係模型在階層分群法及遺傳演算法的實際運用

A New Quantitative Structure-Activity Relationship Model for Practical Applications using Hierarchical Clustering Genetic Algorithms

Advisor: 高成炎

Abstract


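The main goal of quantitative structure-activity relationship (QSAR) modeling is to establish a mathematical mapping between the physicochemical properties of compounds and their biological activities. A derived QSAR model can then serve many practical applications, such as compound classification, diagnosis of drug mechanisms, prediction of biological activity, and lead optimization. QSAR is therefore generally regarded as the best approach in computer-aided drug design. In this thesis, we use genetic algorithm-based partial least squares (GA-PLS) and hierarchical clustering-based partial least squares (HC-PLS) to build a reliable and versatile QSAR model. Through a series of studies, our model was successfully validated on two biological data sets, Selwood and Holloway. Its advantages can be summarized as follows. First, GA-PLS can select the significant molecular descriptors that play an important role in determining biological activity; in addition, two new elements designed to remedy the shortcomings of GA-PLS, chromosomes that encode the latent variables and biased mutation, further improve its efficiency and accuracy. Second, HC-PLS can identify the representative compounds in a data set, which helps both biological activity prediction and in-depth analysis of cluster properties. We have also confirmed experimentally that including biological activity, alongside molecular descriptors, in the measure of compound similarity makes it more likely to find compounds that are truly similar in both physicochemical and biological properties. The final results are satisfactory: the QSAR model derived by GA-PLS and HC-PLS not only has high predictive power but also improves our understanding of drug action and provides a theoretical basis for drug design.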
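The idea of letting biological activity enter the similarity measure, and then picking a representative compound per cluster, can be illustrated with a minimal sketch. This is not the thesis's HC-PLS implementation: the blended distance, the weight `w`, average linkage, and the medoid rule for choosing a representative are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def distance(x_a: Sequence[float], y_a: float,
             x_b: Sequence[float], y_b: float, w: float = 0.5) -> float:
    """Blend descriptor distance with activity difference.

    `w` is a hypothetical weight (0 = descriptors only, 1 = activity only).
    """
    return (1 - w) * math.dist(x_a, x_b) + w * abs(y_a - y_b)


def cluster(X: List[Sequence[float]], y: List[float],
            k: int, w: float = 0.5) -> List[List[int]]:
    """Naive average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(X))]

    def linkage(a: List[int], b: List[int]) -> float:
        total = sum(distance(X[i], y[i], X[j], y[j], w) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        _, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


def representative(members: List[int], X: List[Sequence[float]],
                   y: List[float], w: float = 0.5) -> int:
    """Assumed rule: the medoid, i.e. the member closest to all others."""
    return min(
        members,
        key=lambda i: sum(distance(X[i], y[i], X[j], y[j], w) for j in members),
    )


# Toy data: two descriptors per compound and one activity value each.
X = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [5.0, 5.2]]
y = [1.0, 1.1, 3.0, 3.1, 2.9]
groups = cluster(X, y, k=2)
reps = [representative(g, X, y) for g in groups]
```

Setting `w = 0` reduces the measure to a purely structural similarity, which makes it easy to compare the two choices on the same data. In practice the descriptors and activities would need to be scaled to comparable ranges before blending.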

Parallel Abstract


The purpose of quantitative structure-activity relationship (QSAR) modeling is to formulate mathematical relationships between the physicochemical properties of compounds and their experimentally determined in vitro biological activities. A derived QSAR model can subsequently be put to many practical uses, such as compound classification, diagnosis of drug mechanisms, prediction of biological activity, and lead optimization. QSAR is commonly regarded as the best approach to computational molecular design. To develop a reliable and versatile QSAR model, this thesis employs genetic algorithm-based partial least squares (GA-PLS) and hierarchical clustering-based partial least squares (HC-PLS). In a series of studies, the model was successfully validated on the Selwood and Holloway data sets. Its benefits can be summarized as follows. First, GA-PLS is capable of selecting the significant molecular descriptors that play an important role in determining biological activity. By encoding the latent variables of PLS into the chromosome and combining biased mutation with uniform mutation, GA-PLS further improves the efficiency and accuracy of the QSAR model. Second, HC-PLS is able to identify the representative compounds in the data set, which facilitates molecular property prediction and further analysis of the subsets. When similarity is computed from both molecular descriptors and biological activities (actual values for the training data and predicted values for the test data), compounds judged similar are more likely to exhibit similar physicochemical and biological properties. With these encouraging results, the highly predictive QSAR model derived by GA-PLS and HC-PLS not only enhances our understanding of the specifics of drug action but also provides a theoretical foundation for future lead optimization.
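The two GA-PLS refinements named in the abstract, a chromosome that also carries the PLS latent variables and a biased mutation used alongside uniform mutation, can be sketched as follows. The abstract does not give the details, so several points here are assumptions: the latent-variable gene is taken to be a single integer count, the bias is taken to mean that selected descriptors are switched off more readily than unselected ones are switched on, and all names and rates are hypothetical. Fitness evaluation (fitting PLS and scoring by cross-validation) is omitted.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Chromosome:
    mask: List[int]   # 1 = descriptor selected, 0 = descriptor dropped
    n_latent: int     # assumed encoding: number of PLS latent variables


def repair(ch: Chromosome, max_latent: int, rng: random.Random) -> Chromosome:
    """Keep at least one descriptor and a latent-variable count PLS can use."""
    if sum(ch.mask) == 0:
        ch.mask[rng.randrange(len(ch.mask))] = 1
    ch.n_latent = max(1, min(ch.n_latent, max_latent, sum(ch.mask)))
    return ch


def uniform_mutation(ch: Chromosome, rate: float, max_latent: int,
                     rng: random.Random) -> Chromosome:
    """Flip every bit with the same probability; occasionally redraw n_latent."""
    mask = [1 - b if rng.random() < rate else b for b in ch.mask]
    n_latent = ch.n_latent
    if rng.random() < rate:
        n_latent = rng.randint(1, max_latent)
    return repair(Chromosome(mask, n_latent), max_latent, rng)


def biased_mutation(ch: Chromosome, p_off: float, p_on: float,
                    max_latent: int, rng: random.Random) -> Chromosome:
    """Assumed bias: p_off > p_on, nudging the search toward fewer descriptors."""
    mask = []
    for b in ch.mask:
        p = p_off if b == 1 else p_on
        mask.append(1 - b if rng.random() < p else b)
    return repair(Chromosome(mask, ch.n_latent), max_latent, rng)


def mutate(ch: Chromosome, rng: random.Random, max_latent: int = 5,
           p_biased: float = 0.5) -> Chromosome:
    """Combine the two operators by choosing one at random per call."""
    if rng.random() < p_biased:
        return biased_mutation(ch, 0.10, 0.02, max_latent, rng)
    return uniform_mutation(ch, 0.05, max_latent, rng)


rng = random.Random(0)
parent = Chromosome(mask=[1, 0, 1, 1, 0, 0, 1, 0], n_latent=3)
child = mutate(parent, rng)
```

Carrying the latent-variable count in the chromosome lets the search tune model complexity together with the descriptor subset, rather than fixing it beforehand. The repair step keeps every candidate valid by capping the count at the number of selected descriptors.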
