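Metabolic syndrome is a cluster of risk factors for cardiovascular and metabolic disease, and its diagnosis and treatment continue to receive considerable international attention. As the number of obese children rises with unhealthy diets and lifestyles, the International Diabetes Federation (IDF) published diagnostic criteria for metabolic syndrome in children and adolescents in 2007; however, these criteria may not be suitable for assessing metabolic syndrome in children and adolescents in Taiwan. The purpose of this study is to use machine learning algorithms to find a predictive model that can effectively assess metabolic syndrome in children and adolescents in Taiwan. Among the health reports of 2,362 children and adolescents aged 10 to 16, 81 met the diagnostic criteria for metabolic syndrome under the IDF criteria for children and adolescents combined with Taiwan's BMI-based definition of obesity. In addition, 107 of the 2,281 health reports that did not meet the criteria were selected by random sampling as the group without metabolic syndrome. The analyzed parameters included blood pressure, blood glucose, blood lipids, thyroid function, and hemogram values. Five machine learning algorithms in WEKA 3.6 (decision tree, random forest, support vector machine, multilayer perceptron, and logistic regression) were used for analysis and to build predictive models. The results show that the support vector machine model for children and adolescents aged 10 to 16 in Taiwan reached an area under the curve (AUC) of 0.967, the highest among the algorithms, while the support vector machine and the decision tree achieved the best accuracy at 90.9%. Furthermore, for diagnosing metabolic syndrome in children and adolescents in Taiwan, the decision tree results show that BMI, triglycerides, fasting glucose, diastolic blood pressure, and low-density lipoprotein yield a diagnostic accuracy of 90.9%.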
Metabolic syndrome is a cluster of risk factors for cardiovascular disease and diabetes. Because the prevalence of childhood obesity is rising with unhealthy diets and lifestyles, the International Diabetes Federation (IDF) published diagnostic criteria for metabolic syndrome in children and adolescents in 2007. Yet the IDF also recognized that these criteria may not apply across the racial, gender, and age differences in this population, which is still developing adult physical and sexual characteristics. The aim of this study is to use machine learning algorithms to predict metabolic syndrome in Taiwanese children and adolescents for early screening and diagnosis. A total of 2,362 health records of children and adolescents aged 10 to 16 years were collected from one health examination center, and 162 records were enrolled for this analysis (81 records matched the diagnostic criteria for metabolic syndrome; another 107 records that did not match the criteria were extracted by random sampling). The presence of metabolic syndrome was diagnosed according to the IDF criteria, with obesity identified by body mass index (BMI) according to the definition of obesity in Taiwanese children and adolescents published by the Ministry of Health and Welfare (Taiwan 2002). Eighteen features obtained from physical measurements and biochemical blood tests were extracted for prediction, including BMI, blood pressure, fasting serum glucose (FG), lipid profile, thyroid function, and hemogram. For model construction, we applied WEKA 3.6, using five classifiers to predict metabolic syndrome: decision trees (DT), random forests (RF), support vector machines (SVM), multilayer perceptron (MLP), and logistic regression (LR).
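The random selection of comparison records described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure actually used in the study: the integer record IDs, the `draw_controls` helper, and the fixed seed are hypothetical, while the counts (2,362 records, 81 matching the criteria, 107 sampled from the remainder) are taken from the abstract.

```python
import random


def draw_controls(non_case_ids, n_controls, seed=0):
    """Pick n_controls record IDs at random, without replacement.

    Hypothetical helper: a fixed seed is assumed so the draw is repeatable.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(non_case_ids, n_controls))


# Hypothetical integer IDs standing in for the health records.
total_records = 2362
case_ids = list(range(81))                      # records meeting the criteria
non_case_ids = list(range(81, total_records))   # records not meeting them

control_ids = draw_controls(non_case_ids, 107)

print(len(non_case_ids))  # 2281 records available for sampling
print(len(control_ids))   # 107 records drawn as the comparison group
```

Sampling without replacement guarantees that no record appears twice in the comparison group, and fixing the seed makes the same subset recoverable later.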
We evaluated accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) to assess predictive performance. Five-fold cross-validation was used to evaluate the experimental results. We conclude that applying a support vector machine (LibSVM) to predict metabolic syndrome, which reached an AUC of 0.967, can serve as an effective method to assist in establishing a clinical decision-making system. Both SVM and decision trees reached the highest accuracy, 90.9%. In addition, BMI, triglycerides (TG), FG, low-density lipoprotein (LDL), and diastolic blood pressure were selected as the most effective features for diagnosing metabolic syndrome in children and adolescents aged 10 to 16 years in Taiwan.
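For readers unfamiliar with the evaluation measures named above, the following sketch shows how they can be computed. It is an illustration in plain Python rather than the WEKA workflow used in the study; the function names, the 0.5 decision threshold, the unstratified fold assignment, and the toy labels and scores are all assumptions made for the example and are not study data.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives that were cleared."""
    return tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores (made up for illustration).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, tn, fn))  # 0.75
print(sensitivity(tp, fn))       # 0.75
print(specificity(tn, fp))       # 0.75
print(auc(y_true, scores))       # 0.9375
```

In a five-fold evaluation, each of the lists returned by `k_fold_indices` would serve once as the test set while the remaining four are used for training, and the measures would then be summarized over the five test folds.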