Data mining tools and methods have been advancing rapidly. Common approaches build models from either a single classifier or a combination of multiple classifiers, and ensemble techniques that combine multiple classifiers have recently been widely discussed. This study therefore uses Boosting, an ensemble technique, as its classification method: several base classifiers each classify the data, and their votes are then combined with equal weights to produce a better model. The aim is to compare the strengths and weaknesses of a single classifier against a combination of multiple layers of base classifiers. Three methods, naive Bayes, the J48 decision tree, and the support vector machine, are used as single classifiers. Boosting is then applied with each of the three as its base classifier, and the three methods are also combined with one another, giving five categories of models in total. Experiments were run in WEKA on four UCI data sets. The results show that the ensemble models outperform the single classifiers, and that ensembles built from several different classification algorithms achieve higher classification accuracy than a single classification algorithm.
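The equal-weight voting step described in the abstract can be illustrated with a minimal sketch. This is not the WEKA implementation used in the study and it does not perform Boosting itself; it only shows how the labels predicted by several base classifiers might be combined by majority vote. The function names and the three toy threshold rules (hypothetical stand-ins for naive Bayes, J48, and a support vector machine) are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine one predicted label per base classifier by equal-weight voting.

    Ties are broken in favour of the label that appears first in `predictions`.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def ensemble_predict(classifiers, x):
    """Ask every base classifier for a label, then take the majority vote."""
    return majority_vote([clf(x) for clf in classifiers])


# Hypothetical base classifiers: simple threshold rules on a 2-feature input.
def rule_a(x):
    return "pos" if x[0] > 0.5 else "neg"


def rule_b(x):
    return "pos" if x[1] > 0.5 else "neg"


def rule_c(x):
    return "pos" if x[0] + x[1] > 1.0 else "neg"


if __name__ == "__main__":
    base = [rule_a, rule_b, rule_c]
    print(ensemble_predict(base, (0.9, 0.2)))  # rule_a and rule_c vote "pos"
    print(ensemble_predict(base, (0.1, 0.6)))  # rule_a and rule_c vote "neg"
```

Each classifier contributes exactly one vote here, so a single weak or mistaken base classifier can be outvoted by the other two, which is the intuition behind comparing the combined models with the single classifiers.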