Reducing Data Attributes to Improve Learning Efficiency

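In data mining and machine learning, an excessive number of attributes can reduce learning efficiency and require a great deal of training time, yet not every attribute in a dataset is necessary for describing the data as a whole. Rough set theory is an effective tool for reducing attributes. This study applies rough set attribute reduction (reducts) to sixteen datasets selected from the UCI database and the example files of the ROSE2 software. Five learning methods, namely artificial neural networks (ANN), support vector machines (SVM), Bayesian networks, and the ID3 and C4.5 decision trees, are then used to test whether attribute reduction improves learning efficiency and to observe how much it affects learning accuracy. The experiments show that learning with the minimal attribute set found by rough set theory does improve learning efficiency, and that there is also room for improvement in learning accuracy.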
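To make the idea of a reduct concrete, the following is a minimal sketch of attribute reduction on a decision table. It is not the procedure used in the study or the algorithm implemented in ROSE2: it assumes a brute-force search over attribute subsets (exponential in the number of attributes, so only suitable for tiny tables), and the table, the attribute names, and the helper functions `positive_region_size` and `find_reducts` are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy decision table (not one of the study's sixteen datasets).
TOY_TABLE = [
    {"headache": "yes", "muscle_pain": "yes", "temperature": "normal", "flu": "no"},
    {"headache": "yes", "muscle_pain": "yes", "temperature": "high", "flu": "yes"},
    {"headache": "yes", "muscle_pain": "yes", "temperature": "very_high", "flu": "yes"},
    {"headache": "no", "muscle_pain": "yes", "temperature": "normal", "flu": "no"},
    {"headache": "no", "muscle_pain": "no", "temperature": "high", "flu": "no"},
    {"headache": "no", "muscle_pain": "yes", "temperature": "very_high", "flu": "yes"},
]
CONDITIONS = ["headache", "muscle_pain", "temperature"]
DECISION = "flu"


def positive_region_size(rows, attrs, decision):
    """Count rows whose equivalence class on `attrs` has one decision value."""
    classes = defaultdict(list)
    for row in rows:
        # Rows that agree on every attribute in `attrs` are indiscernible.
        key = tuple(row[a] for a in attrs)
        classes[key].append(row[decision])
    return sum(len(d) for d in classes.values() if len(set(d)) == 1)


def find_reducts(rows, condition_attrs, decision):
    """Return every smallest attribute subset that keeps the positive region."""
    full = positive_region_size(rows, condition_attrs, decision)
    # Try subsets from smallest to largest and stop at the first size that works.
    for size in range(1, len(condition_attrs) + 1):
        found = [
            subset
            for subset in combinations(condition_attrs, size)
            if positive_region_size(rows, subset, decision) == full
        ]
        if found:
            return found
    return []


print(find_reducts(TOY_TABLE, CONDITIONS, DECISION))
# → [('headache', 'temperature'), ('muscle_pain', 'temperature')]
```

In this toy table, either two-attribute subset classifies every row as consistently as all three attributes do, so a learner could be trained on two columns instead of three. Practical tools typically rely on heuristics or discernibility-matrix methods rather than exhaustive search.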

Keywords

Attribute Reduction; Machine Learning; Data Mining; Rough Set Theory

Parallel Abstract

In data mining and machine learning, a plethora of attributes can reduce learning efficiency and demand long training times. Furthermore, not all attributes are important to the data. Rough set theory is an effective tool for reducing attributes. This study applies the attribute reduction (reduct) approach of rough set theory to 16 datasets selected from the UCI database and the example files of the ROSE2 software. Five learning methods, the artificial neural network, support vector machine, Bayesian network, and ID3 and C4.5 decision trees, are then used for comparison. The results show that learning with the smallest attribute subset found by rough set theory does improve learning efficiency, and that accuracy also has room to improve.

Parallel Keywords

Reduced Attribute; Machine Learning; Data Mining; Rough Set Theory

