Metabolic syndrome (MetS) is a precursor to many chronic diseases. Having MetS increases the risk of major chronic diseases such as chronic kidney disease, coronary heart disease, and other metabolic diseases (for example, cerebrovascular stroke and diabetes). In recent years, machine learning has developed rapidly and has been widely applied to medical information analysis. To construct an effective prediction model, this study uses MetS data on vegetarians from the U.S. National Health and Nutrition Examination Survey (NHANES) as its empirical data. It combines four machine learning classifiers, namely logistic regression (LGR), support vector machine (SVM), multivariate adaptive regression splines (MARS), and extreme gradient boosting (XGBoost), with an embedded feature selection method (LASSO) and an oversampling method (Over) for data balancing to build integrated multi-stage prediction models. The proposed integrated multi-stage models are then compared with the corresponding single models. The empirical results show that the integrated prediction models outperform the single prediction models, and that the O-L-XGBoost model (oversampling, then LASSO, then XGBoost) yields better prediction results. Performing feature selection after data balancing effectively improves prediction performance, which indicates that the proposed approach is a more suitable way to build MetS prediction models over all the features considered.
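The data balancing stage that precedes feature selection can be illustrated with a minimal sketch. The abstract does not specify which oversampling variant is used, so the code below assumes plain random oversampling (duplicating minority-class rows at random). The function name `random_oversample`, the toy feature values, and the seed are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class.

    Assumption: plain random oversampling; the study may use another variant.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    # Start from the original data and append duplicated minority rows.
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: two features per subject, label 1 = MetS, 0 = no MetS.
X = [[90, 130], [85, 120], [88, 125], [110, 150], [80, 118]]
y = [0, 0, 0, 1, 0]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In a multi-stage model of the kind described, the balanced data would then be passed to LASSO feature selection and finally to the classifier. Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.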