Mining Association Rules through Effective Classification of Large Itemsets

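In existing data mining algorithms for association rules, the relationships among items become more complex as the number of items grows, so the mining computation often becomes very slow and time-consuming. To speed up association rule mining, some researchers classify the data appropriately, using approaches such as (1) constructing hierarchical relationships and (2) attribute clustering. In this paper we propose another classification approach: large itemsets are classified using a property that follows from the definition of minimum confidence, and the rule-generation procedure is then improved on the basis of that classification, with the aim of accelerating association rule mining.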
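The abstract does not spell out the classification scheme, so the following is only a minimal sketch of the kind of property of confidence that such a scheme can rely on, not the author's method. It assumes the standard definition conf(X => Y) = support(X ∪ Y) / support(X). For a fixed large itemset, moving items from the antecedent to the consequent can only lower the confidence, so a consequent that fails the minimum confidence never needs to be extended. The function name `rules_from_itemset` and the support counts are hypothetical and chosen for illustration.

```python
from itertools import combinations


def rules_from_itemset(support_count, itemset, min_conf):
    """Return (antecedent, consequent, confidence) rules for one large itemset.

    Consequents are grown level by level; a consequent is extended only if
    every smaller consequent contained in it already met min_conf.
    """
    itemset = frozenset(itemset)
    rules = []
    # Level 1: every single item is a candidate consequent.
    consequents = [frozenset([item]) for item in itemset]
    while consequents:
        kept = set()
        for consequent in consequents:
            antecedent = itemset - consequent
            if not antecedent:
                continue  # a rule needs a non-empty antecedent
            conf = support_count[itemset] / support_count[antecedent]
            if conf >= min_conf:
                rules.append((antecedent, consequent, conf))
                kept.add(consequent)
        # Build the next level only from consequents that passed.
        size = len(next(iter(kept))) + 1 if kept else 0
        candidates = {a | b for a, b in combinations(kept, 2) if len(a | b) == size}
        consequents = [
            c for c in candidates
            if all(frozenset(sub) in kept for sub in combinations(c, size - 1))
        ]
    return rules


# Hypothetical support counts over 10 transactions (illustrative data only).
support_count = {
    frozenset("A"): 6, frozenset("B"): 5, frozenset("C"): 4,
    frozenset("AB"): 4, frozenset("AC"): 3, frozenset("BC"): 3,
    frozenset("ABC"): 3,
}
rules = rules_from_itemset(support_count, "ABC", min_conf=0.7)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "=>", sorted(consequent), round(conf, 2))
```

With these counts, the two-item consequents {A, C} and {B, C} fall below the threshold, so no larger consequent is ever generated from them; this is the saving that the pruning is meant to provide.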

Keywords

Data Mining; Association Rules

Parallel Abstract

In existing data mining algorithms for association rules, the correlations between items become quite complicated as the number of items grows, and the computation then takes a great deal of time. To reduce the computing time, some researchers classify the items appropriately, for example by (1) building hierarchical relations and (2) attribute clustering. In this thesis, another classification method is proposed. First, the large itemsets are classified using a property of the definition of minimum confidence. The process of generating association rules is then streamlined on the basis of this classification. This method may reduce the computing time considerably.

Parallel Keywords

Data Mining; Association Rules; KDD
