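Dialysis treatment has become a heavy burden on the National Health Insurance system, and nephropathy is the main factor determining whether diabetic patients progress to dialysis. This study applies data mining techniques to the National Health Insurance database to identify the disease risk factors by which diabetic patients without nephropathy develop nephropathy and enter dialysis within the following three years (that is, dialysis treatment for diabetic nephropathy). We conduct a retrospective cohort study on the National Health Insurance database and build a disease risk factor analysis model using under-sampling based on clustering (SBC), classification and regression trees (CART), and support vector machines (SVM). The results show that patients with the risk factors selected by the model, namely "diabetes duration of five years or more", "proliferative retinopathy", and "vitreous hemorrhage", have a significantly higher incidence and odds ratio of entering dialysis within three years. The proposed model therefore makes effective use of data mining techniques, reduces the effect of class imbalance, and identifies effective disease risk factors, so that the relevant agencies can strengthen health management for groups at high risk of dialysis and reduce the burden on health insurance.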
Dialysis treatment has become a huge burden on national health insurance. Nephropathy is a major factor in determining whether diabetic patients need dialysis treatment. The purpose of this study is to apply data mining techniques to analyze national health insurance databases and explore the disease risk factors for diabetic patients without nephropathy who develop nephropathy and start dialysis treatment within the next three years. The proposed disease risk factor analysis model combines three data mining techniques: under-sampling based on clustering (SBC), classification and regression tree (CART), and support vector machine (SVM). Experimental results show that the proposed techniques select three important risk factors: "diabetes duration of over 5 years", "proliferative diabetic retinopathy", and "vitreous hemorrhage". Diabetic patients with these three risk factors have a higher incidence of dialysis than those without them. The proposed model also handles the class imbalance problem and can be used to accurately identify important disease risk factors and the corresponding high-risk groups.
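The comparison between patients with and without a risk factor rests on two standard quantities, incidence and the odds ratio. The sketch below shows how they can be computed from a 2x2 table. The function names and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
def incidence(events: int, total: int) -> float:
    """Proportion of patients in a group who started dialysis."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


def odds_ratio(exp_events: int, exp_total: int,
               unexp_events: int, unexp_total: int) -> float:
    """Odds ratio of dialysis for patients with a risk factor (exposed)
    versus patients without it (unexposed), from a 2x2 table."""
    a = exp_events                   # exposed, started dialysis
    b = exp_total - exp_events       # exposed, no dialysis
    c = unexp_events                 # unexposed, started dialysis
    d = unexp_total - unexp_events   # unexposed, no dialysis
    if min(a, b, c, d) <= 0:
        raise ValueError("every cell of the 2x2 table must be positive")
    return (a * d) / (b * c)


# Hypothetical counts, NOT from the study: (dialysis cases, group size).
with_factor = (30, 1000)
without_factor = (20, 9000)

print("incidence with factor:   ", incidence(*with_factor))
print("incidence without factor:", incidence(*without_factor))
print("odds ratio:              ", odds_ratio(*with_factor, *without_factor))
```

An odds ratio above 1 means dialysis is more likely among patients who have the factor. Judging whether the difference is significant would additionally need a confidence interval or a statistical test, which this sketch leaves out.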