
運用機器學習演算法對脂肪肝預測研究

Clinical Application of Machine Learning Algorithms to Predict Fatty Liver Disease

Advisor: 李友專

Abstract


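In recent years the computerization of healthcare systems has rapidly increased the volume of medical data, and data mining techniques are increasingly applied in medical diagnostic systems. From complex medical examination records, relevant factors are selected as variables, and machine learning classification techniques such as multilayer perceptron neural networks, random forests, support vector machines, and logistic regression are used to classify the likelihood of disease. This gives physicians a reference for diagnostic decisions before instrument-based examinations or invasive procedures (for example, biopsy). The study also attempts to reduce excess attributes, remove unnecessary features, handle erroneous or missing data, and choose appropriate parameters for each machine learning method so as to achieve the most suitable performance, and thereby to identify the classification method that best assists clinical diagnosis. By applying different validation and evaluation methods adjusted to the attributes of different disease data, the study seeks to understand which models are better suited to medical diagnosis, which would substantially benefit diagnosis, treatment, and prognosis prediction.

This study uses several representative classifiers in the WEKA 3.7 analysis software, namely the multilayer perceptron neural network (MLP), support vector machine (SVM/SMO), random forest (RF), and logistic regression, to classify, predict, and comparatively analyze 10 types of medical diagnostic data. The selected variables are age, sex, abdominal girth, triglycerides, fasting blood glucose, serum glutamic-oxaloacetic transaminase (SGOT), serum glutamic-pyruvic transaminase (SGPT), high-density lipoprotein cholesterol, systolic blood pressure, and diastolic blood pressure. The study examines the factors that influence the models' predictions, the association between fatty liver and abnormal liver biochemistry, and the differences in these parameters between subjects with and without fatty liver. Abdominal ultrasound serves as the basis for diagnosing whether fatty liver is present, and the classification predictions and their results are compared in order to assess the risk of fatty liver.

The study uses accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC) to evaluate the performance of the predictive models, and thereby to compare and verify which of random forest, support vector machine, neural network, and logistic regression gives the better results as a decision support tool for predicting fatty liver.

The classification model with the better results is expected to be applied to preliminary clinical screening for fatty liver, so that abdominal ultrasound need not be relied on as the sole first-line screening tool. This would greatly reduce the burden on the national health insurance system, help physicians judge accurately whether fatty liver is present, help prevent subsequent liver disease, and support the goal of early detection and early treatment.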
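The evaluation measures named above can be illustrated with a minimal sketch. This is not the WEKA workflow used in the thesis; it is plain Python written under stated assumptions: labels are coded 1 for fatty liver on ultrasound and 0 otherwise, the function names are hypothetical, and the example labels are made up purely to show the arithmetic.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels: 1 = fatty liver, 0 = no fatty liver."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero (e.g. no positive cases)."""
    return numerator / denominator if denominator else float("nan")


def screening_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Illustrative labels only; not data from the study.
    ultrasound = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # reference diagnosis
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]    # hypothetical classifier output
    print(screening_metrics(ultrasound, predicted))
```

For a screening tool, sensitivity is the share of ultrasound-confirmed cases the model flags, while specificity is the share of unaffected subjects it correctly clears, so the two are usually reported together rather than relying on accuracy alone.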

Parallel Abstract


In recent years, the capacity of healthcare computer data systems has increased rapidly, and data mining technology is gradually being used in medical diagnostic systems. For instance, before ordering an abdominal ultrasound examination and/or an invasive intervention (biopsy) to assess a clinical suspicion of fatty liver disease, a physician can first apply a medical data mining solution, selecting attributes from the laboratory data in complicated medical records and taking useful, related factors as variables. By incorporating machine learning classification techniques such as Multilayer Perceptron Neural Networks, Random Forests, Support Vector Machines, and Logistic Regression, the software can provide a reference that helps clinical physicians make an accurate diagnosis. To assist clinical diagnosis and medical treatment, we need to identify the best machine learning algorithm by comparing their outcomes. Furthermore, by adjusting the verification methods, assessment methods, and selected attributes, we also need to determine which machine learning model is more suitable for clinical diagnosis. This study uses liver biochemistry data and WEKA 3.7 to analyze the differences in statistical parameters between fatty and non-fatty livers. The parameters used are age, abdominal girth, triglyceride, Glucose AC, SGOT-AST, SGPT-ALT, systolic blood pressure, and diastolic blood pressure. Fatty liver was diagnosed based on the result of abdominal ultrasound, and fatty liver predictive models were built using Random Forest, Support Vector Machine, and Artificial Neural Network techniques. The performance of the models will be evaluated according to accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC).
The results of this study are expected to help prevent severe liver disease through early detection and treatment, and to help physicians give appropriate and efficient medical advice to their patients.
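As a rough illustration of the AUROC measure mentioned above, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen fatty liver case receives a higher predicted score than a randomly chosen non-fatty case, with ties counted as one half. The function name `auroc` and the example scores are hypothetical and not taken from the study; WEKA reports this value itself, so the code only makes the definition concrete.

```python
def auroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores.

    y_true: binary labels (1 = fatty liver, 0 = no fatty liver).
    scores: predicted scores or probabilities, higher meaning more likely positive.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Illustrative values only; not results from the thesis.
    labels = [1, 1, 0, 0]
    model_scores = [0.9, 0.4, 0.5, 0.1]
    print(auroc(labels, model_scores))
```

A value of 0.5 corresponds to scores that carry no ranking information, and 1.0 to scores that separate the two groups perfectly, which is why AUROC is convenient for comparing classifiers independently of any single decision threshold.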

