  • Thesis/Dissertation

A Comprehensive Dataset for Predicting Cytochrome P450-Mediated Metabolism: A Systematic Evaluation of Machine Learning and Deep Learning

A Comprehensively-Curated Dataset for Prediction of Cytochrome P450 Isoforms Mediated Metabolism: A Systematic Evaluation of Machine Learning and Deep Learning Methods

Advisor: 曾宇鳳

Abstract


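The detoxification capacity of cytochrome P450 (CYP450) enzymes plays a crucial role in regulating drugs and harmful compounds in the body. Among them, CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4 are the main enzymes, responsible for more than 90% of drug metabolism. CYP450 enzymes significantly affect the bioavailability of drugs in the body, which in turn relates to pharmacological efficacy. Predicting CYP450-mediated metabolism of small molecules can therefore accelerate ligand screening and is critical in the early stages of drug development. The datasets used by QSAR models for predicting CYP450-mediated metabolism in related studies remain insufficient and inconsistent. Moreover, because the data in those datasets are limited, the models perform poorly or are not applicable. In this study, we therefore systematically evaluated a range of machine learning and deep learning methods with several optimization strategies, and built a set of robust prediction models for six important CYP450 enzymes. We also collected compounds involved in CYP450-mediated metabolism from all available data sources and, after a series of preprocessing and manual verification steps, provide the most comprehensive, efficient, and up-to-date compound dataset. The results show that prediction models trained on our curated dataset outperform those trained on training sets from other studies, demonstrating that our cleaning and verification procedures are effective. Furthermore, after a systematic evaluation of the algorithms, prediction models built with a graph convolutional network (GCN) achieved the highest performance, with MCCs between 0.50 (CYP2C19) and 0.72 (CYP1A2). These results can aid the development of prediction models for other CYP450 interactions and support compound screening during drug development.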

Parallel Abstract


The detoxification ability of cytochrome P450 (CYP450) enzymes plays an essential role in regulating the presence of drugs and harmful compounds in our bodies. CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4 are the main enzymes, responsible for over 90 percent of drug metabolism. Consequently, they significantly affect the bioavailability of drugs in the body, which is related to pharmacological efficacy. Predicting the properties of small molecules related to CYP450-mediated metabolism is therefore essential in the early stage of drug discovery, because it accelerates the ligand screening process. The datasets used by QSAR models for CYP450 substrate prediction in related studies are still insufficient and inconsistent. Furthermore, because those datasets are limited, the resulting models perform inadequately or are not applicable. To address this, we construct a set of robust prediction models for the six essential CYP450 enzymes by evaluating various machine learning and deep learning approaches with practical optimization techniques. In addition, we present the most comprehensive and high-quality dataset of compounds that interact with CYP450 enzymes, collected from all available resources. The dataset's quality is ensured through several manual curation steps. The results show that our highly curated dataset is competitive for training CYP450 substrate prediction models. After a systematic evaluation of the approaches, prediction models built with the Graph Convolution Network (GCN) algorithm reached the highest performance, with Matthews correlation coefficients (MCCs) between 0.50 (CYP2C19) and 0.72 (CYP1A2). We hope these results will be helpful for other in silico approaches to CYP450-mediated metabolism and will support compound screening during drug development.
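For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a Matthews correlation coefficient can be computed for a binary substrate/non-substrate classifier. The labels and predictions are hypothetical illustrations and are not data from the thesis; the function names are likewise assumptions made for this example.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = substrate, 0 = non-substrate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical labels for eight compounds (not taken from the thesis dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

example_counts = confusion_counts(y_true, y_pred)  # (3, 3, 1, 1)
example_mcc = mcc(*example_counts)                 # (9 - 1) / sqrt(4*4*4*4) = 0.5
print(example_mcc)
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction. Because it uses all four confusion-matrix cells, it is often preferred over accuracy when the two classes are imbalanced.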

