

Taguchi Method in Feature Selection and Parameter Determination

Advisor: 白炳豐


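Abstract (translated from the Chinese) Traditional medical diagnosis mostly relies on physicians diagnosing disease from past experience. As medical systems have been computerized, the information hidden in their databases has grown increasingly rich. Advanced analysis techniques can uncover correlations between symptom features and diseases and so help physicians diagnose more accurately. Many studies have confirmed that applying machine learning significantly improves the accuracy of disease diagnosis. Support vector machines (SVMs) have been widely applied to classification and prediction problems in recent years. Using an SVM raises two problems: selecting the input features and tuning the SVM parameters. Optimizing feature selection and parameter tuning at the same time is therefore an important issue. This study builds on the public medical datasets from UCI, applies feature extraction, uses Taguchi experimental design for parameter design and feature screening, and uses an ant algorithm to find the SVM parameters. The resulting disease diagnosis model addresses both SVM problems at once. Experimental results show that the proposed method improves overall classification accuracy while using fewer features and better parameters. Its diagnostic performance is also compared with that of a back-propagation neural network.
Keywords: disease diagnosis, feature extraction, Taguchi experimental design, support vector machine, ant algorithm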


Abstract In traditional medical diagnosis, doctors usually diagnose patients according to their own experience. With the computerization of medical systems, more and more information lies hidden in their databases. Advanced data analysis methods that find the correlation between disease features and diseases can help doctors improve the accuracy of disease diagnosis. Many studies have confirmed that machine learning can significantly improve disease diagnosis accuracy. The support vector machine (SVM) has been widely used for classification and forecasting problems. Using an SVM raises two problems: one is choosing the input features, and the other is setting the SVM parameters. Optimizing feature selection and parameter setting at the same time is therefore an important topic. The proposed approach uses the Taguchi method for feature selection and parameter design, then applies an ant algorithm to find the SVM parameters. The data are medical datasets from the University of California at Irvine (UCI) Machine Learning Repository. The experimental results show that the proposed approach improves overall accuracy while using fewer features and better parameters. Its diagnostic performance is also compared with that of a back-propagation neural network (BPNN).
Keywords: Disease Diagnosis, Feature Selection, Taguchi Method, Support Vector Machine, Ant Algorithm
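The abstract does not spell out how the Taguchi method screens features, so the following is only a minimal sketch of one common way to do it, not the thesis's actual procedure. It assumes seven candidate features, the standard L8(2^7) orthogonal array with each column treated as an "include / exclude" factor, and a made-up scoring function `toy_accuracy` that stands in for the cross-validated SVM accuracy a real study would measure. All function names and the gain values are hypothetical.

```python
from statistics import mean

# Standard L8(2^7) orthogonal array: 8 runs x 7 two-level factors.
# Assumption for this sketch: 1 = feature included, 0 = feature excluded.
L8 = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 0, 0, 0, 0),
    (1, 0, 0, 1, 1, 0, 0),
    (1, 0, 0, 0, 0, 1, 1),
    (0, 1, 0, 1, 0, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 1, 0, 0, 1),
    (0, 0, 1, 0, 1, 1, 0),
]


def toy_accuracy(mask):
    """Hypothetical stand-in for the accuracy of a classifier trained
    on the features flagged in `mask` (invented numbers, not real data)."""
    gains = (0.08, -0.02, 0.05, -0.01, 0.03, -0.03, 0.06)
    return 0.60 + sum(g for g, used in zip(gains, mask) if used)


def main_effects(array, responses):
    """For each factor, mean response with the feature included
    minus mean response with it excluded."""
    effects = []
    for j in range(len(array[0])):
        on = mean([r for row, r in zip(array, responses) if row[j] == 1])
        off = mean([r for row, r in zip(array, responses) if row[j] == 0])
        effects.append(on - off)
    return effects


def select_features(array, evaluate):
    """Run one experiment per row and keep the features whose
    inclusion raises the mean response."""
    responses = [evaluate(row) for row in array]
    effects = main_effects(array, responses)
    return [j for j, effect in enumerate(effects) if effect > 0]


if __name__ == "__main__":
    print(select_features(L8, toy_accuracy))
```

Only 8 of the 2^7 = 128 possible feature subsets are evaluated. Because every pair of columns in the array is balanced, the effect of each feature is averaged over an equal mix of settings for the others. Swapping `toy_accuracy` for a real evaluation (for example, cross-validated SVM accuracy on a UCI dataset) is left to the reader.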


