A Study of Hybrid Data Mining Techniques for Data Analysis

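The support vector machine (SVM) has been one of the most widely used classification methods in recent years. Its theory derives mainly from structural risk minimization (SRM) in statistical learning theory, and it is regarded as a new-generation learning algorithm. It has been applied across many fields, for example bioinformatics, image analysis, handwriting recognition, and the detection of everyday anomalies such as credit card fraud and events in surveillance video. SVM classification accuracy is better than that of the maximum likelihood method: its accuracy is higher and more stable, without the swings between high and low values seen with maximum likelihood. SVM also gives better results when segmenting an image into regions for individual classes. This study therefore takes a data mining (DM) approach: Bayesian networks (BN), SVM, and decision trees (DT) are each used for attribute selection, and each is then combined with an SVM for classification. The approach is applied to four UCI (University of California, Irvine) databases, and the results are compared with those reported in earlier papers. The results show that combining a decision tree with an SVM gives the larger improvement in classification accuracy.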
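The abstract does not say how the decision tree performs attribute selection. As a minimal sketch, assuming attributes are ranked by information gain (the splitting criterion of ID3-style decision trees), the selection step could look like the following. The function names and the toy data are hypothetical and are not taken from the study or from the UCI databases.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on the attribute at index attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_attributes(rows, labels, k):
    """Return the indices of the k attributes with the highest information gain."""
    ranked = sorted(range(len(rows[0])),
                    key=lambda a: information_gain(rows, labels, a),
                    reverse=True)
    return sorted(ranked[:k])


def project(rows, keep):
    """Keep only the selected attribute columns."""
    return [[row[a] for a in keep] for row in rows]


# Hypothetical toy data: attribute 0 determines the class, the others do not.
rows = [
    ["sunny", "hot", "a"],
    ["sunny", "mild", "b"],
    ["rainy", "hot", "a"],
    ["rainy", "mild", "b"],
]
labels = ["no", "no", "yes", "yes"]

keep = select_attributes(rows, labels, k=1)
reduced = project(rows, keep)
```

The reduced rows would then be passed to an SVM classifier for the classification stage, for instance `sklearn.svm.SVC` from scikit-learn; that library choice is an assumption here, since the abstract does not name the software used.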

Keywords

Support vector machine; Bayesian network; decision tree; data mining

Parallel Abstract

The support vector machine (SVM) has been one of the most commonly used classification methods in recent years. Its theory derives mainly from structural risk minimization (SRM) in statistical learning theory, and it is regarded as a new-generation learning algorithm. It is currently applied in various fields, including bioinformatics, image analysis, handwriting recognition, and the detection of everyday anomalies such as credit card fraud and events in surveillance video. Classification with SVM is more accurate and more stable than with the maximum likelihood method, whose accuracy tends to swing between high and low values. SVM is also more effective at segmenting an image into regions for individual classes. This study therefore takes a data mining (DM) approach: Bayesian networks (BN), SVM, and decision trees (DT) are each used for attribute selection and then combined with SVM for classification. The approach is applied to four UCI (University of California, Irvine) databases, and the results are compared with those of past studies. The results show that combining DT with SVM gives the larger improvement in classification accuracy, which supports using this method to build a classification system.

Parallel Keywords

Support Vector Machines (SVM); Bayesian Networks (BN); Decision Trees (DT); Data Mining (DM)

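Cited by

陳威頤 (2014). A study on improving classification accuracy using the Dagging ensemble learning algorithm [Master's thesis, National Formosa University]. Airiti Library. https://doi.org/10.6827/NFU.2014.00169

彭健瑜 (2013). A study on the classification accuracy of boosting with ensemble techniques [Master's thesis, National Formosa University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0028-1408201314530700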
