A Study of Incremental Block Depth-First Association Rule Mining

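Traditional association rule mining methods take a great deal of time to complete the mining of data. Although researchers have proposed incremental mining frameworks, these still cannot avoid repeated scans of the existing database. This paper therefore proposes a mining strategy that uses an item data structure and block depth-first search. It needs only a single scan of the transaction database to build the data structure used by the mining procedure, which avoids repeated database scans, and when generating association rules it needs to compare only the necessary items. In addition, for dynamic databases whose data grow incrementally, the incremental mining mechanism of the proposed algorithm uses the information recorded during earlier mining to complete the mining without rescanning the old data. This paper also runs mining-performance experiments on real data against traditional algorithms, and compares and analyzes the results. The experimental results show that the proposed algorithm saves a large amount of mining time.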

Keywords

Data mining; association rules; incremental mining

Parallel Abstract

Data mining techniques and applications have received a lot of attention in the past decade, and finding association rules among data is one of the most active topics in data mining. By applying data mining techniques, we can efficiently extract valuable information from large volumes of raw data. However, as computer technology evolves, data grow constantly over time, and the time spent finding valuable information grows sharply. Designing an efficient data mining scheme is therefore extremely important. This paper focuses on this issue and proposes the I-BDFS (Incremental Block Depth First Search) algorithm to address it. In I-BDFS, the raw data need to be scanned only once, instead of the repeated database scans required by previous algorithms. The proposed algorithm can also quickly generate large itemsets by intersecting only the necessary items, which saves considerable execution time during mining. Moreover, with the help of the data structure designed for I-BDFS, the algorithm needs to mine only the necessary specific patterns, saving scanning and comparison time. Finally, we conduct several experiments with real data to evaluate the performance of I-BDFS alongside some traditional algorithms. The experimental results show that I-BDFS performs better than those traditional algorithms.
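The abstract gives no pseudocode for I-BDFS, so the following is only a minimal sketch of the general idea it describes: scan the transactions once to build a per-item structure, search itemsets depth-first by intersecting those structures, and fold in new transactions without rescanning old ones. It is a generic vertical (transaction-id set) miner, not the authors' algorithm; the names `build_index` and `mine`, the `min_count` threshold, and the use of Python sets are all assumptions made for illustration, and the block partitioning of I-BDFS is not modeled.

```python
from collections import defaultdict


def build_index(transactions, index=None, start_tid=0):
    """Scan the given transactions once and record, for each item,
    the set of transaction ids (tids) that contain it.

    Passing an existing index and a start_tid appends new transactions
    without touching the old ones (hypothetical stand-in for the
    incremental mechanism mentioned in the abstract).
    """
    index = defaultdict(set) if index is None else index
    for offset, items in enumerate(transactions):
        for item in items:
            index[item].add(start_tid + offset)
    return index


def mine(index, min_count):
    """Depth-first search over itemsets. Support is computed by
    intersecting tid sets, so the raw transactions are never re-read."""
    frequent = {}

    def extend(prefix, prefix_tids, candidates):
        for i, (item, tids) in enumerate(candidates):
            # Intersect only with items that can still extend this prefix.
            new_tids = tids if prefix_tids is None else prefix_tids & tids
            if len(new_tids) >= min_count:
                itemset = prefix + (item,)
                frequent[itemset] = len(new_tids)
                extend(itemset, new_tids, candidates[i + 1:])

    # Infrequent single items are pruned before the search starts.
    singles = sorted(
        ((item, tids) for item, tids in index.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend((), None, singles)
    return frequent
```

To use the sketch, call `build_index` on the initial transactions, then `mine(index, min_count)`. When new transactions arrive, call `build_index(new_transactions, index, start_tid=n)`, where `n` is the number of transactions already indexed, and run `mine` again. In this simplified version only the index is reused; the search itself is repeated over the updated index, whereas the abstract suggests I-BDFS also reuses recorded mining results.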

Parallel Keywords

Data Mining; Association Rules; Incremental

