
The Impact of Performance with Multi-Level Rule Priority for Associative Classification

Advisor: 蔣定安

Abstract


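When association rules are applied to text categorization, the rules are generally ranked by confidence (high to low), then by support (high to low), then by rule length (short to long). However, the multiple-class problem that arises in text categorization is ignored or rarely discussed in most studies. This thesis adds a class condition to the general ranking scheme and examines how it affects text categorization performance. The work is implemented on the Reuters-21578 collection. Association rule mining finds all rules whose items co-occur in a document, and the Lazy method prunes and ranks them. The occurrences of the rules in each class are then counted to determine the priority order of the classes. Finally, a classifier is built from the rules after they are reordered by class priority, test documents of unknown class are used to verify classification performance, and the results are examined to see whether different class orders improve classification.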
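The general ranking described above (confidence descending, support descending, rule length ascending) can be sketched as a single sort key. This is a minimal illustration, not the thesis implementation: the `Rule` class, the `rank_rules` function, and the sample terms, classes, and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical class-association rule: antecedent terms -> class label."""
    antecedent: frozenset  # terms that must all appear in a document
    label: str             # class predicted by the rule
    confidence: float
    support: float


def rank_rules(rules):
    """Sort by confidence (high to low), support (high to low), length (short to long)."""
    return sorted(
        rules,
        key=lambda r: (-r.confidence, -r.support, len(r.antecedent)),
    )


# Made-up rules for illustration only; not taken from Reuters-21578.
rules = [
    Rule(frozenset({"oil", "price"}), "crude", 0.90, 0.05),
    Rule(frozenset({"oil"}), "crude", 0.90, 0.05),
    Rule(frozenset({"wheat"}), "grain", 0.95, 0.02),
    Rule(frozenset({"bank"}), "money-fx", 0.90, 0.08),
]
ranked = rank_rules(rules)
```

In this sketch, the two `crude` rules tie on confidence and support, so the shorter antecedent is ranked first.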

Parallel Abstract


When association rules are applied to text classification, rules are generally ranked by confidence, support, and rule length. However, most recent research ignores the issue of multiple classes. This study adds a class condition to the general ranking and discusses how the resulting ranking method affects text classification. Our data source is the Reuters-21578 collection, and the implementation steps are as follows: 1. apply association rule mining to discover all frequent ruleitems; 2. prune and rank the rules with the Lazy method; 3. count the rule frequencies of each class to decide the order of the classes; 4. build the associative classifier according to the class priority; 5. classify unseen test documents to verify performance and to observe whether different class priorities improve the accuracy of associative classification.
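Steps 3 to 5 can be sketched roughly as follows. The abstract does not say how rule frequencies map to a class order, so this sketch assumes that classes with more rules come first. It also assumes a first-match classifier. All function names and sample rules are hypothetical.

```python
from collections import Counter

# Hypothetical rules as (antecedent terms, class label), already in rank order.
ranked_rules = [
    (frozenset({"wheat"}), "grain"),
    (frozenset({"oil"}), "crude"),
    (frozenset({"oil", "price"}), "crude"),
    (frozenset({"bank"}), "money-fx"),
]


def class_priority(rules):
    """Order classes by rule count (assumption: classes with more rules first)."""
    counts = Counter(label for _, label in rules)
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(counts, key=lambda c: (-counts[c], c))


def reorder_by_class(rules, priority):
    """Stable sort by class priority; the original rank is kept within a class."""
    position = {c: i for i, c in enumerate(priority)}
    return sorted(rules, key=lambda rule: position[rule[1]])


def classify(document_terms, rules, default=None):
    """Return the label of the first rule whose antecedent the document contains."""
    for antecedent, label in rules:
        if antecedent <= document_terms:
            return label
    return default


priority = class_priority(ranked_rules)
classifier_rules = reorder_by_class(ranked_rules, priority)
prediction = classify({"wheat", "oil", "export"}, classifier_rules)
```

With these made-up rules, the sample document matches both a `grain` rule and a `crude` rule. The class order therefore decides which label the first-match classifier returns, which is the effect the study sets out to measure.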

