
應用集成式技術的推進法於資料分類準確度之研究

Study on Application of Using Boosting for Improving Accuracy

Advisor: 顧瑞祥

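Abstract

Data mining tools and methods have advanced rapidly in recent years. The most common approaches are single-classifier models and models that combine multiple classifiers, and ensemble techniques for combining multiple classifiers have recently been widely discussed. This study therefore evaluates Boosting, an ensemble technique, as a classification method: it classifies with several base classifiers and then combines their votes with equal weights to produce a better model. The aim is to compare a single classifier against combinations of multiple base classifiers. Three methods serve as single classifiers: naive Bayes, the J48 decision tree, and the support vector machine. Boosting is then applied with each of the three as the base classifier, and the three methods are also combined with one another, giving five categories of models in total. Four UCI datasets are used for testing and evaluation, and the experiments are run in the WEKA software tool. The results show that the ensemble classifier models outperform the single classifiers, and that an ensemble built from several different classification algorithms achieves higher classification accuracy than a single classification algorithm.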

Parallel Abstract


Data mining tools and methods are changing day by day. The more common approaches build a model from a single classifier or from a combination of multiple classifiers, and ensemble techniques for multiple classifiers have recently been widely discussed. The main purpose of this study is therefore to use Boosting, an ensemble technique, as the classification method under assessment. The method classifies with a number of base classifiers and then votes with equal weights to integrate them into a better model. The study aims to compare the merits of a single classifier with those of combined base classifiers. Three methods are used as single classifiers: simple (naive) Bayesian classification, the J48 decision tree, and the support vector machine. Boosting is applied with each of these three as the base classifier, and the three methods are also combined with each other, for a total of five categories of models. Four UCI datasets are used for experimental testing and evaluation, and the tests are run with the WEKA software tool. The results show that the ensemble classifier models perform better than the single-classifier models, and that an ensemble built from several different classification algorithms gives better classification accuracy than a single classification algorithm.
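The comparison between a single classifier and a boosted ensemble can be illustrated with a minimal sketch. It is not the thesis's WEKA setup: it assumes the classic AdaBoost procedure with one-feature decision stumps as base classifiers, a hypothetical six-point toy dataset, and training accuracy as the only measure. Classic AdaBoost weights each vote by the base classifier's weighted error, which differs from the equal-weight voting described in the abstract. All function and variable names are illustrative.

```python
import math

def stump_predict(stump, x):
    """A stump predicts `polarity` when x[feature] <= threshold, else the opposite label."""
    feature, threshold, polarity = stump
    return polarity if x[feature] <= threshold else -polarity

def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def adaboost(X, y, rounds):
    """Train `rounds` stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

def accuracy(ensemble, X, y):
    return sum(predict(ensemble, xi) == yi for xi, yi in zip(X, y)) / len(X)

# Hypothetical toy data: no single threshold separates the two classes.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]

single = adaboost(X, y, rounds=1)    # one stump, i.e. a single weak classifier
boosted = adaboost(X, y, rounds=3)   # three stumps combined by weighted vote

print("single stump accuracy:", accuracy(single, X, y))
print("boosted accuracy:     ", accuracy(boosted, X, y))
```

On this toy data the single stump classifies 4 of the 6 points correctly, while the three-round ensemble classifies all 6. The figures describe training accuracy on invented data only and say nothing about the accuracies reported in the thesis.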

Keywords

Data Mining; Boosting; Ensemble; Decision Tree; WEKA

