
Using Association Classification Rules to Improve the Accuracy of Text Categorization with Different Classifiers

Advisor: 黃連進

Abstract


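When Associative Classification (AC) is used for classification, data that AC cannot classify are usually assigned directly to a predefined default class, so that no data are left unclassified. When building an AC classifier, however, the most common difficulty is setting the thresholds applied after rule generation. Thresholds set too high delete many potentially useful rules and leave many test cases unclassifiable, while thresholds set too low tend to produce misclassifications; both situations reduce classification accuracy. To address these problems and improve accuracy, we propose using two different classifiers together, each doing a different job at a different stage according to its characteristics. This thesis uses KNN or a Bayesian classifier to perform an initial classification of the documents, then uses the resulting classification results to set the various thresholds and select the Associative Classification Rules (ACRs) that satisfy them. Because the accuracy of each of these ACRs is higher than that of the initial classification, the selected ACRs can be used to further improve the classification results. Documents that the ACRs cannot classify are classified by KNN or the Bayesian classifier, which computes term weights. This reduces both the time needed to generate rules and the number of rules, which in turn speeds up classification, so combining the strengths of different classifiers can effectively improve text categorization performance. Experiments show that combining two different classifiers as proposed in this thesis does achieve better classification performance than a single classifier.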
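The abstract gives no implementation details, so the following is only a minimal sketch of the two-stage idea under stated assumptions. It assumes single-term rules of the form term -> class, a multinomial Naive Bayes baseline, and a confidence threshold set to the baseline's accuracy on a held-out set. All function names, the `min_support` parameter, and the toy documents are hypothetical and invented for illustration; none of them come from the thesis.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes baseline."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        term_counts[label].update(doc)
    vocab = {term for doc in docs for term in doc}
    return class_counts, term_counts, vocab

def predict_nb(model, doc):
    """Return the class with the highest add-one-smoothed log score."""
    class_counts, term_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        n_terms = sum(term_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for term in doc:
            if term in vocab:  # terms never seen in training are skipped
                score += math.log((term_counts[label][term] + 1) / (n_terms + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, docs, labels):
    """Fraction of held-out documents the baseline labels correctly."""
    hits = sum(predict_nb(model, doc) == label for doc, label in zip(docs, labels))
    return hits / len(docs)

def mine_rules(docs, labels, min_conf, min_support=2):
    """Keep term -> class rules whose confidence is strictly above min_conf."""
    term_class = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        for term in set(doc):
            term_class[term][label] += 1
    rules = {}
    for term, counts in term_class.items():
        label, hits = counts.most_common(1)[0]
        conf = hits / sum(counts.values())
        if hits >= min_support and conf > min_conf:
            rules[term] = (label, conf)
    return rules

def classify(doc, rules, nb_model):
    """Use the most confident matching rule; otherwise fall back to the baseline."""
    matches = [rules[term] for term in set(doc) if term in rules]
    if matches:
        label, _ = max(matches, key=lambda rule: (rule[1], rule[0]))
        return label, "rule"
    return predict_nb(nb_model, doc), "fallback"

# Toy, made-up data (assumption): tokenized documents with two classes.
train_docs = [["ball", "team", "win"], ["ball", "score", "team"], ["team", "coach", "win"],
              ["code", "bug", "team"], ["code", "chip", "fast"], ["chip", "bug", "fix"]]
train_labels = ["sports", "sports", "sports", "tech", "tech", "tech"]
val_docs = [["ball", "win"], ["code", "chip"], ["team", "coach"], ["bug", "fix"], ["team", "score"]]
val_labels = ["sports", "tech", "sports", "tech", "tech"]

nb_model = train_nb(train_docs, train_labels)
baseline_acc = accuracy(nb_model, val_docs, val_labels)           # stage 1: initial classification
rules = mine_rules(train_docs, train_labels, min_conf=baseline_acc)  # stage 2: threshold from stage 1

print(classify(["ball", "fast"], rules, nb_model))   # covered by a rule
print(classify(["team", "coach"], rules, nb_model))  # no rule matches, baseline decides
```

In this sketch the rule for `team` is discarded because its confidence (0.75) does not exceed the baseline's held-out accuracy, so documents containing only such ambiguous terms are left to the baseline classifier instead of a default class. Multi-term rules, a KNN baseline, or per-class thresholds would follow the same two-stage shape.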

Keywords

Association, Classifier, Chinese, Text

Parallel Abstract


In recent years, many wireless broadband networking technologies have been proposed and discussed. IEEE 802.16j, with its MMR network structure, is one of the most notable. Through the relay station mechanism, relay stations can substitute for high-cost base stations to provide broader network coverage and greater bandwidth. The choice of location for these relay stations has therefore become a topic worth discussing. Starting from the present IEEE 802.16 network, this dissertation works out a relay station placement mechanism: within the acceptable range of the base station, it finds the best and most efficient sub-area for a relay station by considering the differences in data flow and allocated bandwidth. The findings are verified through a series of experiments, which lead to the most suitable rule for the placement of relay stations.

Parallel Keywords

Association Classification Chinese Text.

