Applying a Bond-Based Algorithm to Association Rule Mining

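Most existing methods for mining association rules in large databases rely on a support-based pruning strategy to reduce the time spent searching for rules. At a low support threshold, however, this strategy cannot effectively find potentially valuable patterns, and the low threshold also drives up the demand for additional resources such as memory. At a high support threshold, it loses patterns that have low support but high confidence and high correlation. This study first proves that the bond measure has a cross-support property, and then uses this property to prune itemsets of no value, which speeds up the algorithm and saves system resources. Moreover, if the bond of an itemset exceeds the minimum bond threshold, the support of that itemset is bounded below by some limit, and the confidence of the association rules derived from that itemset is likewise bounded below; the association rules mined with the bond measure are therefore valuable. Finally, the mechanism is applied to real transaction data. The experimental results show that the pruning strategy based on the cross-support property of the bond reduces the time needed to find large itemsets, and that the items within the mined large itemsets are highly correlated.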

Keywords

data mining; association rules; bond; cross-support

Parallel Abstract

Most current methods for mining association rules in large databases use a support-based pruning strategy to reduce the search space. However, this strategy is inefficient at mining valuable patterns when the support threshold is low, because it consumes a large amount of resources. When the support threshold is high, it loses valuable itemsets that have low support but high confidence and high correlation. This paper applies a bond-based threshold to the mining of association rules in large databases. We first prove that the bond has a cross-support property and then use this property to prune itemsets of no value, which improves the efficiency of the algorithm and conserves system resources. If the bond of an itemset is greater than the bond threshold, the support of the itemset is bounded below by some limit, the confidence of the association rules produced from the itemset is also bounded below by some limit, and the individual items in the itemset are highly correlated. Therefore, when both the bond and the support pruning strategies are used, the resulting association rules are valuable. Our experiments were performed on real data sets. The results show that this approach reduces the search space and finds valuable patterns whose individual items are highly correlated.
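The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes the common definition of the bond of an itemset (the number of transactions containing all of its items divided by the number containing at least one of them), and it assumes that "cross-support" refers to the fact that the bond can never exceed the ratio of the smallest to the largest single-item support. The function names `bond`, `cross_support_bound`, and `mine_by_bond`, and the sample data, are hypothetical.

```python
from itertools import combinations


def bond(itemset, transactions):
    """Bond under the assumed definition: conjunctive / disjunctive support."""
    items = frozenset(itemset)
    conj = sum(1 for t in transactions if items <= t)  # contains every item
    disj = sum(1 for t in transactions if items & t)   # contains any item
    return conj / disj if disj else 0.0


def cross_support_bound(itemset, transactions):
    """Cheap upper bound on the bond: min item support / max item support."""
    counts = [sum(1 for t in transactions if i in t) for i in itemset]
    return min(counts) / max(counts) if max(counts) else 0.0


def mine_by_bond(transactions, min_bond):
    """Level-wise search keeping itemsets (size >= 2) with bond >= min_bond.

    Relies on the bond being anti-monotone: adding items can only lower
    the numerator and raise the denominator.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]  # singletons always have bond 1
    result = {}
    while level:
        # Join itemsets of the current level that differ in exactly one item.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = []
        for c in candidates:
            # Assumed cross-support pruning: skip the exact bond computation
            # when the item supports alone rule the candidate out.
            if cross_support_bound(c, transactions) < min_bond:
                continue
            b = bond(c, transactions)
            if b >= min_bond:
                result[c] = b
                level.append(c)
    return result


# Hypothetical toy data, not taken from the paper's experiments.
transactions = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}, {"c", "d"}]
found = mine_by_bond(transactions, min_bond=0.5)
```

In this toy data the pair of "c" and "d" is discarded by the bound alone, because "d" occurs once and "c" three times, so its bond cannot exceed 1/3.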

Parallel Keywords

data mining; association rules; bond; cross-support


