Title

整合藥物特性預測藥物副作用之研究

Translated Titles

Integrating Bio-medical properties to predict drug side effects

Authors

張軒豪

Key Words

similarity analysis ; drug classification ; drug clustering ; data mining ; drug side effects

Publication Name

National Sun Yat-sen University, Department of Information Management, Degree Thesis

Volume or Term/Year and Month of Publication

2016

Academic Degree Category

Master's

Advisor

李偉柏

Content Language

Traditional Chinese

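Chinese Abstract (translated)

Taking medication often carries the risk of side effects. According to statistics, deaths from adverse drug reactions in the United States have risen to as many as 100,000 per year, making them the fourth leading cause of death in the country [1]. Medical reports indicate that 6.7% of hospitalized patients experience serious adverse reactions and that, in the United States, 0.32% of hospitalized patients die from side effects, ranking between the fourth and sixth leading causes of death among hospitalized patients [2]. To reduce adverse drug reactions (ADRs), this study collects two kinds of drug property data: biological properties and chemical properties. Using data mining methods, the study first clusters the drugs according to sequence analysis and chemical-structure similarity, searching for the number of clusters with the closest within-cluster distances and comparing the clustering results under different weights in order to obtain the best clustering. Classification models are then trained on the related drug data to predict the occurrence of side effects, so that the harm caused by side effects can be predicted effectively. Based on the classification results, a similarity analysis identifies the key factors that link drugs to the side effects they cause. The results are presented visually so that the general public can understand them easily, making the relationship between drugs and side effects easier to grasp and reducing the information asymmetry surrounding drugs. The classification results are further used to examine the imbalance among drug side-effect samples. After repeated observation, this study proposes three strategies to address the imbalance: side effects are divided into three segments by degree of sample imbalance, and a different strategy is applied to each segment. The aim of the study is to predict the occurrence of side effects from drug data that combine both kinds of properties, incorporating knowledge from the computational field to produce a comprehensive, broad analysis that differs from previous studies.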
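The idea of clustering drugs under different weights on the two property types can be illustrated with a minimal sketch. It is not the thesis's implementation: the blending formula, the threshold-based grouping (connected components), the toy similarity scores, and the names `combine_similarities` and `threshold_clusters` are all assumptions made for illustration.

```python
from itertools import combinations


def combine_similarities(bio_sim, chem_sim, weight):
    """Blend two symmetric similarity matrices: weight * bio + (1 - weight) * chem."""
    n = len(bio_sim)
    return [
        [weight * bio_sim[i][j] + (1.0 - weight) * chem_sim[i][j] for j in range(n)]
        for i in range(n)
    ]


def threshold_clusters(sim, threshold):
    """Group items linked by similarity >= threshold (connected components)."""
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if sim[i][j] >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical toy data: four drugs with made-up similarity scores in [0, 1].
BIO_SIM = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.4],
    [0.1, 0.2, 0.4, 1.0],
]
CHEM_SIM = [
    [1.0, 0.5, 0.1, 0.2],
    [0.5, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.9],
    [0.2, 0.1, 0.9, 1.0],
]

if __name__ == "__main__":
    # Sweep the weight to see how the grouping changes.
    for w in (0.0, 0.5, 1.0):
        combined = combine_similarities(BIO_SIM, CHEM_SIM, w)
        print(w, threshold_clusters(combined, threshold=0.6))
```

With these toy values, relying on only one property type leaves some drugs ungrouped, while an equal weighting yields two pairs, which is the kind of effect a weight sweep is meant to expose.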

English Abstract

Taking medication often means accepting the risk of drug side effects. According to medical reports, the number of patients who die from adverse drug reactions each year has risen to about 100,000, making such reactions the fourth leading cause of death in the United States [1]. It is therefore essential for medical professionals to be able to anticipate likely side effects when prescribing new drugs to a patient [2]. In this study, we collect biological and chemical properties of drugs from available resources and use them as data features for predicting side effects. We adopt a data mining approach that uses classification and clustering to evaluate the importance of different data features and to make predictions accordingly. After a series of quantitative analyses, we develop a new hybrid approach that combines the characteristics of different methods to exploit their advantages. Many sets of experiments were conducted on popular datasets to verify the proposed approach, and the results show that it outperforms the compared methods. Moreover, we calculate the associations between drug features and side effects and use data visualization to illustrate them. Some cases are analyzed and discussed in depth. With this visualization, users can easily inspect how drug features relate to side effects.

Topic Category

College of Management > Department of Information Management
Social Sciences > Management

Reference

  1. [3] Fei Wang, Ping Zhang, Nan Cao, Jianying Hu, Robert Sorrentino, "Exploring the associations between drug side-effects and therapeutic indications," no. 51, pp. 15-23, 2014.
  2. [4] Wu, T.Y., Jen, M.H., Bottle, A., Molokhia, M., Aylin, P., Bell, D., and Majeed, A., "Ten-year trends in hospital admissions for adverse drug reactions in England 1999-2009," J. R. Soc. Med, no. 103, pp. 239–250, 2010.
  3. [6] Atias N, Sharan R, "An algorithmic framework for predicting side effects of drugs," J Comput Biol, no. 18, pp. 207–18, 2011.
  4. [7] Pauwels E, Stoven V, Yamanishi Y, "Predicting drug side-effect profiles: a chemical fragment-based approach," BMC Bioinform, no. 12, p. 169, 2011.
  5. [8] Scheiber J, Jenkins JL, Sukuru SCK, Bender A, Mikhailov D, Milik M, et al., "Mapping adverse drug reactions in chemical space," J Med Chem, no. 52, pp. 3103–7, 2009.
  6. [9] Li J, Zhu X, Chen JY, "Building disease-specific drug-protein connectivity maps from molecular interaction networks and pubmed abstracts," PLoS Comput Biol, no. 5, p. e1000450, 2009.
  7. [10] Kotelnikova E, Yuryev A, Mazo I, Daraselia N, "Computational approaches for drug repositioning and combination therapy design," J Bioinform Comput Biol, no. 8, pp. 593–606, 2010.
  8. [11] Scheiber J, Chen B, Milik M, Sukuru SCK, Bender A, Mikhailov D, et al., "Gaining insight into off-target mediated effects of drug candidates with a comprehensive systems chemical biology analysis," J Chem Inform Model, no. 49, pp. 308-17, 2009.
  9. [12] Xie L, Li J, Xie L, Bourne PE, "Drug discovery using chemical systems biology: identification of the protein-ligand binding network to explain the side effects of cetp inhibitors," PLoS Comput Biol, no. 5, p. e1000387, 2009.
  10. [13] Hu G, Agarwal P, "Human disease-drug network based on genomic expression profiles," PLoS One, no. 4, p. e6536, 2009.
  11. [14] Sirota M, Dudley JT, Kim J, Chiang AP, Morgan AA, Sweet-Cordero A, et al., "Discovery and preclinical validation of drug indications using compendia of public gene expression data," Sci Translat Med, no. 3, p. 96ra77, 2011.
  12. [15] C. Tung, "ChemDIS: a chemical–disease inference system based on chemical–protein interactions," J Cheminform, 2015.
  13. [16] Miquel Duran-Frigola, Patrick Aloy, "Analysis of Chemical and Biological Features Yields Mechanistic Insights into Drug Side Effects," Chemistry & Biology, no. 20, pp. 594–603, 2013.
  14. [17] Wen Zhang, Hua Zou, Longqiang Luo, Qianchao Liu, Weijian Wu, Wenyi Xiao, "Predicting potential side effects of drugs by recommender methods and ensemble learning," Neurocomputing, 2015.
  15. [18] Smith, T.F., Waterman, M.S., Burks, C., "The statistical distribution of nucleic acid similarities," Nucleic Acids Res, no. 13, pp. 645–656, 1985.
  16. [19] U. v. Luxburg, "A Tutorial on Spectral Clustering," 2007.
  17. [22] Sayaka Mizutani, Edouard Pauwels, Véronique Stoven, Susumu Goto, Yoshihiro Yamanishi, "Relating drug–protein interaction network with drug side effects," Bioinformatics, no. 28(18), 2012.
  18. [23] Mei Liu, Yonghui Wu, Yukun Chen, Jingchun Sun, Zhongming Zhao, Xue-wen Chen, Michael Edwin Matheny, Hua Xu, "Large-scale prediction of adverse drug reactions using chemical, biological, and phenotypic properties of drugs," TRANSLATIONAL BIOINFORMATICS, 2012.
  19. [24] Cheng F, Li W, Wang X, Zhou Y, Wu Z, Shen J, Tang Y, "Adverse drug events: database construction and in silico prediction," J Chem Inf Model, no. 53(4), pp. 744-52, 2013.
  20. [26] Chou CL, Hsu CC, Chou CY, Chen TJ, Chou LF, Chou YC, "Tablet splitting of narrow therapeutic index drugs: a nationwide survey in Taiwan," Int J Clin Pharm, 2015.
  21. [27] Lan CC, Liu CC, Lin CH, Lan TY, McInnis MG, Chan CH, Lan TH, "A reduced risk of stroke with lithium exposure in bipolar disorder: a population-based retrospective cohort study," Bipolar Disord, 2015.
  22. [29] Tang KT, Lin CH, Chen HH, Chen YH, Chen DY, "Suicidal drug overdose in patients with systemic lupus erythematosus, a nationwide population-based case-control study," Lupus, 2015.
  23. [32] Zaki MJ, "SPADE: an efficient algorithm for mining frequent sequences," Mach Learn, no. 42(1-2), pp. 31-60, 2001.
  24. [33] W. DuMouchel, "Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system," Am Statistician, no. 53, pp. 177–90, 1999.
  25. [34] Chao Ou-Yang, Sheila Agustianty, Han-Cheng Wang, "Developing a data mining approach to investigate association between physician prescription and patient outcome - A study on re-hospitalization in Stevens-Johnson Syndrome," Computer Methods and Programs in Biomedicine, no. 112(1), pp. 84-91, 2013.
  26. [35] Aileen P. Wright, Adam T. Wright, Allison B. McCoy, Dean F. Sittig, "The use of sequential pattern mining to predict next prescribed medications," Journal of Biomedical Informatics, no. 53, pp. 73-80, 2015.
  27. [36] Zhengxing Huang, Wei Dong, Lei Ji, Chenxi Gan, Xudong Lu, Huilong Duan, "Discovery of clinical pathway patterns from event logs using probabilistic topic model," Journal of Biomedical Informatics, no. 47, pp. 39-57, 2014.
  28. [37] Blei DM, Ng AY, Jordan MI, "Latent Dirichlet allocation," J Mach Learn Res, no. 3, pp. 993-1022, 2003.
  29. [38] Li, H. and J.Y. Chen, "Improved Biomedical Document Retrieval System with PubMed Term Statistics and Expansions (accepted)," International Journal of Computational Intelligence in Bioinformatics and Systems Biology, 2008.
  30. [39] Korbel, J.O., et al., "Systematic association of genes to phenotypes by genome and literature mining," PLoS Biology, no. 3(5), p. e134, 2005.
  31. [40] Jiao Li, Xiaoyan Zhu, Jake Yue Chen, "Mining Disease-Specific Molecular Association Profiles from Biomedical Literature: A Case Study," ACM symposium on Applied computing, pp. 1287-1291, 2008.
  32. [42] Rong Xu, QuanQiu Wang, "Automatic construction of a large-scale and accurate drug-side-effect association knowledge base from biomedical literature," Journal of Biomedical Informatics, no. 51, pp. 191-199, 2014.
  33. [43] Rong Xu, QuanQiu Wang, "Combining automatic table classification and relationship extraction in extracting anticancer drug-side effect pairs from full-text articles," Journal of Biomedical Informatics, no. 53, pp. 128-135, 2015.
  34. [44] Sun Kim, Haibin Liu, Lana Yeganova, W. John Wilbur, "Extracting drug–drug interactions from literature using a rich feature-based linear kernel approach," Journal of Biomedical Informatics, no. 55, pp. 23-30, 2015.
  35. [45] Sarvnaz Karimi, Alejandro Metke-Jimenez, Madonna Kemp, Chen Wang, "CADEC: A corpus of adverse drug event annotations," Journal of Biomedical Informatics, no. 55, pp. 73-81, 2015.
  36. [46] Abeed Sarker, Graciela Gonzalez, "Portable automatic text classification for adverse drug reaction detection via multi-corpus training," Journal of Biomedical Informatics, no. 53, pp. 196-207, 2015.
  37. [47] Ming Yang, Melody Kiang, Wei Shang, "Filtering big data from social media - Building an early warning system for adverse drug reactions," no. 54, pp. 230-240, 2015.
  38. [48] Freifeld CC, Brownstein JS, Menone CM, Bao W, Filice R, Kass-Hout T, Dasgupta N, "Digital Drug Safety Surveillance: Monitoring Pharmaceutical Products in Twitter," no. 37(7), p. 555, 2014.
  39. [50] Serkan Ayvaz et al., "Toward a complete dataset of drug–drug interaction information from publicly available sources," Journal of Biomedical Informatics, no. 55, pp. 206-217, 2015.
  40. [52] Ping Zhang, Pankaj Agarwal, and Zoran Obradovic, "Computational Drug Repositioning by Ranking and Integrating Multiple Data Sources," Lecture Notes in Computer Science, vol. 8190, pp. 579-594, 2013.
  41. [1] Giacomini KM, Krauss RM, Roden DM, Eichelbaum M, Hayden MR, Nakamura Y, "When good drugs go bad," Nature, no. 446, pp. 975–7, 2007.
  42. [2] J. Lazarou, B.H. Pomeranz, P.N. Corey, "Incidence of adverse drug reactions in hospitalized patients," Journal of the American Medical Association, no. 279(15), pp. 1200-1205, 1998.
  43. [5] Keiser MJ, Setola V, Irwin JJ, Laggner C, Abbas AI, Hufeisen SJ, et al., "Predicting new molecular targets for known drugs," Nature, no. 462, pp. 175–81, 2009.
  44. [20] S. M. Omohundro, "Five Balltree Construction Algorithms".
  45. [21] Hans Krebs, "The citric acid cycle," 1953.
  46. [25] Srinivasan V Iyer, Rave Harpaz, Paea LePendu, Anna Bauer-Mehren, Nigam H Shah, "Mining clinical text for signals of adverse drug-drug interactions," Journal of the American Medical Informatics Association, pp. 353-362, 2014.
  47. [28] Chao TF, Wang KL, Liu CJ, Lin YJ, Chang SL, Lo LW, Hu YF, Tuan TC, Chung FP, Liao JN, Chen TJ, Chiang CE, Lip GY, Chen SA, "Age Threshold for Increased Stroke Risk Among Patients With Atrial Fibrillation: A Nationwide Cohort Study From Taiwan," J Am Coll Cardiol, 2015.
  48. [30] M.H. Kuo, A.W. Kushniruk, E.M. Borycki, D. Greig, "Application of the Apriori algorithm for adverse drug reaction detection," in: R. Beuscart, W. Hackl, C. Nøhr (Eds.), Detection and Prevention of Adverse Drug Events – Information Technologies and Human Factors, IOS Press, Amsterdam, Netherlands, pp. 95-101, 2009.
  49. [31] Buchta C, Hahsler M, Buchta MC, "Package 'arulesSequences'," 2012.
  50. [41] Rui Zhang, Michael J. Cairelli, Marcelo Fiszman, Graciela Rosemblat, Halil Kilicoglu, Thomas C. Rindflesch, Serguei V. Pakhomov, Genevieve B. Melton, "Using semantic predications to uncover drug–drug interactions in clinical data," Journal of Biomedical Informatics, no. 49, pp. 134-147, 2014.
  51. [49] Abeed Sarker, Rachel Ginn, Azadeh Nikfarjam, Karen O'Connor, Karen Smith, Swetha Jayaraman, Tejaswi Upadhaya, Graciela Gonzalez, "Utilizing social media data for pharmacovigilance: A review," Journal of Biomedical Informatics, no. 54, pp. 202-212, 2015.
  52. [51] Delroy Cameron, Gary A. Smith, Raminta Daniulaityte, Amit P. Sheth, Drashti Dave, Lu Chen, Gaurish Anand, Robert Carlson, Kera Z. Watkins, Russel Falck, "PREDOSE: A semantic web platform for drug abuse epidemiology using social media," Journal of Biomedical Informatics, no. 46, pp. 985-997, 2013.