  • Thesis

Building the Multivariate Linear Regression Model of DPP-IV Inhibitors

Original title: 建立多項變數之線性回歸預估模型--以DPP-IV抑制劑為例

Advisor: 李元綺

Abstract

Using the reasoning and tools of medical informatics, this study approaches a known disease from a biomedical perspective. It draws on experimental data from published journal articles and applies suitable analysis tools to investigate the quantitative structure-activity relationship (QSAR) of chemical structures that inhibit the DPP-IV protein, a known target in the treatment of diabetes. The physicochemical descriptors of each molecular structure were computed mainly with the HyperChem program, which was then used to build the QSAR model; regression equations were obtained by multivariate linear regression on these descriptors. The resulting equations were then applied to other chemical structures, both known drug molecules and non-drug molecules, to explore their potential as drugs.

The study obtained two regression equations.

Formula 1 (Multivariate Linear Regression; MLR):
log(1/IC50) = -7.140 - 0.061*NRB + 0.502*NHB + 0.402*clogP - 0.010*MW - 0.542*Ehomo + 0.093*Nlc - 0.152*CMR + 0.152*Nr + 0.000*Hf (R² = 0.832)

Formula 2 (Partial Least Squares; PLS):
log(1/IC50) = -7.13967 - 0.06097*NRB + 0.50192*NHB + 0.41951*clogP - 0.00956*MW - 0.54250*Ehomo + 0.09312*Nlc - 0.15154*CMR + 0.15194*Nr + 0.00040*Hf (R² = 0.845885)

Statistical tests confirmed that both equations exhibit linear regression behavior and have predictive, inferential, and explanatory power. The descriptors that influence log(1/IC50), in decreasing order of influence, are Ehomo > NHB > clogP > CMR > Nr. That is, the energy change of the three-dimensional structure, the number of hydrogen bonds, and the balance between the structure's water and lipid solubility matter more than the remaining descriptors.

The MLR and PLS methods were compared using a t-test. The MLR-estimated activities differ significantly from the IC50 data, with the MLR estimates significantly higher. There is insufficient evidence of a significant difference between the IC50 data and the PLS-estimated activities, with the PLS estimates slightly lower than the IC50 data. Compared with other methods, the results are slightly inferior to those of Paliwal.

The study also tested whether the regression analysis could yield inhibitory-activity estimates for randomly selected drug molecules. The previously derived regression equations turned out to have no logical explanatory power for structures that are not DPP-IV inhibitors, from which it is inferred that those structures cannot match the valine-pyrrolidide structure. In addition, small molecules known to regulate blood glucose were analyzed; the results mostly indicated strong activity, with IC50 ≤ 100 nM.
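To make the two equations in the abstract concrete, the following is a minimal Python sketch that evaluates Formula 1 (MLR) and Formula 2 (PLS) for one set of descriptor values. The coefficients are copied from the abstract. Everything else is an assumption for illustration: the function names, the descriptor values in `example` (they do not describe any real compound), and the reading of "log" as base 10. The abstract does not state the units of IC50 or of the descriptors, so the back-converted IC50 is in whatever concentration unit the thesis used.

```python
# Coefficients copied from Formula 1 (MLR) and Formula 2 (PLS) in the abstract.
MLR = {
    "intercept": -7.140, "NRB": -0.061, "NHB": 0.502, "clogP": 0.402,
    "MW": -0.010, "Ehomo": -0.542, "Nlc": 0.093, "CMR": -0.152,
    "Nr": 0.152, "Hf": 0.000,
}
PLS = {
    "intercept": -7.13967, "NRB": -0.06097, "NHB": 0.50192, "clogP": 0.41951,
    "MW": -0.00956, "Ehomo": -0.54250, "Nlc": 0.09312, "CMR": -0.15154,
    "Nr": 0.15194, "Hf": 0.00040,
}

# Descriptor names in the order they appear in both equations.
DESCRIPTORS = ("NRB", "NHB", "clogP", "MW", "Ehomo", "Nlc", "CMR", "Nr", "Hf")


def predict_log_inv_ic50(model, descriptors):
    """Return intercept + sum(coefficient * descriptor value)."""
    missing = [name for name in DESCRIPTORS if name not in descriptors]
    if missing:
        raise ValueError(f"missing descriptors: {missing}")
    return model["intercept"] + sum(
        model[name] * descriptors[name] for name in DESCRIPTORS
    )


def ic50_from_log(log_inv_ic50):
    """Invert log(1/IC50), assuming a base-10 logarithm (an assumption)."""
    return 10.0 ** (-log_inv_ic50)


# Hypothetical descriptor values, chosen only to exercise the arithmetic.
example = {
    "NRB": 4, "NHB": 5, "clogP": 1.2, "MW": 300.0, "Ehomo": -9.0,
    "Nlc": 2, "CMR": 8.0, "Nr": 2, "Hf": -50.0,
}

if __name__ == "__main__":
    for label, model in (("MLR", MLR), ("PLS", PLS)):
        y = predict_log_inv_ic50(model, example)
        print(f"{label}: log(1/IC50) = {y:.4f}, IC50 = {ic50_from_log(y):.1f}")
```

Keeping both coefficient sets in the same dictionary layout lets one function serve both models, so the MLR and PLS estimates for a given molecule can be compared side by side, in the spirit of the t-test comparison the abstract describes.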

References


2. Adams, C.P. and V.V. Brantner, Estimating the cost of new drug development: is it really $802 million? Health Affairs, 2006. 25(2): p. 420-428.
3. DiMasi, J.A., R.W. Hansen, and H.G. Grabowski, The price of innovation: new estimates of drug development costs. Journal of Health Economics, 2003. 22(2): p. 151-185.
4. Ponmary Pushpa Latha and J.S. Sharmila, QSAR study for the prediction of IC50 and Log P for 5-N-Acetyl-Beta-D-Neuraminic Acid structurally similar compounds using stepwise (multivariate) linear regression. International Journal of Chemical Research, 2010. 2(1): p. 32-38.
5. Wang, R.B., et al., Structure–activity relationship: analyses of p-glycoprotein substrates and inhibitors. Journal of Clinical Pharmacy and Therapeutics, 2003. 28(3): p. 203-228.
6. Paliwal, S., et al., Development of a robust QSAR model to predict the affinity of pyrrolidine analogs for dipeptidyl peptidase IV (DPP-IV). Journal of Enzyme Inhibition and Medicinal Chemistry, 2011. 26(1): p. 129-140.
