
A Study on Applying a Pre-large Itemset Binary-Tree Technique to Incremental Data Mining

Design an Efficient Incremental Data Mining Algorithm Based on Pre_large Technique

Abstract


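This thesis studies incremental mining algorithms for association rules. Such algorithms aim to retain certain information from the previous mining run so that the correctness and validity of the association rules can be maintained without reprocessing the old data. The study combines a binary-tree structure, which is fast to traverse, with the pre-large (Pre_large) concept, which lowers the probability that an item moves between frequent and non-frequent status. Keeping the number of database scans as low as possible, it proposes an efficient and reliable incremental mining algorithm: the Pre_large Descending frequent pattern Binary-tree Algorithm (PDBA). PDBA saves considerable mining time while preserving the validity of the association rules. In addition to proposing PDBA, the study compares it with two earlier incremental mining algorithms, DFPBT and AFPIM. Experimental results on one example case show that, in that case, PDBA outperforms both DFPBT and AFPIM, saving at least 23% of the mining time.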

Parallel Abstract


This paper addresses the problem of developing an incremental data mining algorithm. We propose the Pre_large Descending frequent pattern Binary-tree Algorithm (PDBA), which maintains the correctness of association rules without reprocessing already-processed data by retaining information from the previous mining run. PDBA uses a binary tree data structure, known for fast traversal, together with the Pre_large concept, which reduces the probability that an item moves between frequent and non-frequent status. It also reduces the number of database scans required, yielding more efficient and reliable incremental mining of association rules. To verify its performance, we implemented PDBA along with DFPBT and AFPIM and compared their efficiency. The experimental results show that PDBA outperforms DFPBT and AFPIM by at least 23% in execution time.
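
The Pre_large concept mentioned in the abstract can be illustrated with a small sketch. The abstract does not give PDBA's data structures or thresholds, so everything below is an assumption for illustration only: the function names, the two thresholds `upper` and `lower`, and the use of plain item counts in place of the binary tree. The idea shown is that items whose support falls between the two thresholds are kept as a buffer ("pre-large"), so that when new transactions arrive, the status of retained items can be updated from stored counts without rescanning the old data.

```python
from collections import Counter


def count_items(transactions):
    """Count, for each item, the number of transactions that contain it."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))  # count each item once per transaction
    return counts


def classify(count, n_transactions, upper, lower):
    """Label an item by its support ratio (thresholds are assumed parameters)."""
    support = count / n_transactions
    if support >= upper:
        return "large"
    if support >= lower:
        return "pre-large"
    return "small"


def incremental_update(kept_counts, n_old, new_transactions, upper, lower):
    """Update item status using only retained counts and the new transactions.

    kept_counts holds counts for items that were large or pre-large in the
    old data. Items absent from kept_counts were small before; their old
    counts were not retained, so they are returned as 'unverified' rather
    than classified.
    """
    new_counts = count_items(new_transactions)
    n_total = n_old + len(new_transactions)
    status = {}
    unverified = []
    for item in set(kept_counts) | set(new_counts):
        if item in kept_counts:
            total = kept_counts[item] + new_counts[item]
            status[item] = classify(total, n_total, upper, lower)
        else:
            unverified.append(item)
    return status, sorted(unverified)


# Hypothetical data: five old transactions, then five new ones.
old = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a", "b", "d"], ["e"]]
old_counts = count_items(old)
kept = {
    item: c
    for item, c in old_counts.items()
    if classify(c, len(old), upper=0.5, lower=0.3) != "small"
}
new = [["c", "d"], ["c"], ["c"], ["c", "d"], ["b", "c"]]
status, unverified = incremental_update(kept, len(old), new, upper=0.5, lower=0.3)
```

In this toy run, items `a` and `b` drop from large to pre-large instead of vanishing from the retained set, which is the buffering effect the Pre_large concept is meant to provide. How PDBA handles the unverified items, and how it arranges retained items in its binary tree, is not described in the abstract and is not modeled here.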
