以調整FP樹狀結構為基礎之關聯規則漸進式探勘方法

近來有許多研究提出有效率從交易資料庫探勘出有用資訊的技術，其中探勘關聯規則即是被廣泛應用的探勘技術。隨著交易的持續進行，資料庫中的資料會隨時間變動，對於先前已探勘出來的關聯規則，有些在新資料庫中可能會變成無效，亦可能有新的規則產生。為了探勘異動後的資料庫中正確而完整的關聯規則。本論文提出AFPIM關聯規則漸進式探勘演法，以FP_tree的結構來儲存前次探勘時資料庫中和探勘相關的資訊，當資料庫發生增加或刪除交易資料等異動時，只需掃描異動的資料庫部份，隨之調整FP-tree結構，便能從調整後的FP-tree找出更新後資料庫的關聯規則，不需重新掃描整個資料庫。AFPIM演算法適用於資料庫新增、刪除及同時有新增和刪除交易資料時關聯規則的漸進式探勘。由實驗結果顯示在執行效率上，AFPIM演算法相較於之前已提出的漸進式探勘演算法(FUP及UWEP演算法)有明顯效率提昇，且在最小支持度門檻值很小的情形下，本演算法仍能維持不錯的執行效率。

關鍵字

資料勘探；關聯規則；規則維護；漸進式探勘； FP樹狀結構

並列摘要

In this thesis, the issue of mining and maintaining association rules in a large database of customer transactions is studied. The maintenance of association rules can be mapped into the problem of maintaining frequent itemsets in the database. Because the mining of association rules is time-consuming, we need an efficient approach to maintain the frequent itemsets when the database is updated. In this study, a general incremental updating technique is proposed for maintaining the frequent itemsets discovered in a database in the cases including insertion, deletion, and modification of transactions in the database. An efficient algorithm, called AFPIM (Adjusting FP-tree for Incremental Mining), is designed based on adjusting FP-tree structures. Our approach uses a FP-tree structure to store the compact information of transactions involving frequent and pre-frequent items in the original database. In most cases, the new FP-tree structure of the updated database can be obtained by adjusting FP-tree of the original database according to the changed transactions without needing to rescan the original database. Experimental results show that AFPIM outperforms the existing algorithms in terms of the execution time.

並列關鍵字

Data Mining ； Association Rule ； Rule Maintenance ； Incremental Mining ； FP-tree structure

參考文獻

[1]　R. Agrawal and R. Srikant, “Fast Algorithm for Mining Association Rule in Large Databases,” in Proc. of The 20th International Conference on Very Large DataBases, 1994.

[4] D.W. Cheung, J. Han, V.T. Ng, and C.Y. Wong, “Maintenance of Discovered Association Rules in Large Databases: An Incremental Update Technique,” in Proc. of the 12th International Conference on Data Engineering (ICDE'96), February, pages106-114, 1996.

[7] J. Han, J. Pei, and Y. Yin, “Mining Frequent Patterns without Candidate Generation,” in Proc. of the ACM SIGMOD Int. Conf. on Management of Data, pages 1-12, Dallas, Texas, USA, 2000.

[8] G. Lee, K.L. Lee and A.L.P. Chen, “Efficient Graph-Based Algorithms for Discovering and Maintaining Association Rules in Large Databases, ” Knowledge and Information Systems, Springer-Verlag, Vol. 3, pages 338-355, 2001.

[11] P.S.M. Tsai, C.C. Lee, and A.L.P. Chen, “An Efficient Approach for Incremental Association Rule Mining,” in Third Pacific-Asia Conference, PAKDD-99 Proceedings, pages 74-83, 1999.

國際替代計量

以調整FP樹狀結構為基礎之關聯規則漸進式探勘方法

主題瀏覽