
應用基因演算法及權重項目法於關聯法則挖掘之研究

Applying Genetic Algorithm and Weight Item to Association Rule

Advisor: 蔡介元

Abstract


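Association rule mining is an important and practical data mining technique. In earlier work, association rules were applied mostly to market basket analysis, to find relationships among different items. Used effectively, the technique can uncover information hidden in large volumes of data, giving decision makers a better basis for their decisions and improving a company's competitiveness. Most applications of association rules, however, consider only the associations produced by different item combinations and do not make full use of the item attributes available in the database to improve the quality of the mined rules. In addition, managers spend considerable time deciding how to set the minimum support and minimum confidence thresholds needed to extract rules during mining, so a fast and objective way of setting these thresholds to obtain meaningful rules is important.

This thesis proposes a new algorithm for mining association rules. It uses the concepts of temporary support and a data cluster index to mine association rules from binary transaction data. The weighted-item concept is used to reflect the relative importance of different items, and it is applied to the fitness function and the target rule function of a heuristic method, the genetic algorithm. Drawing on the genetic algorithm's fast computation and global search behavior, and taking the target rule function and the weight values into account, the method evaluates the value of different association rules and produces an objective recommendation for setting the minimum thresholds. The aim is to mine important and meaningful association rules quickly so that business managers can use them in decision analysis.

To verify that the proposed method is effective, it is applied to several example data sets and to a real credit card spending case, and its effect on the resulting association rules in practical use is examined. These applications show that applying weighted items to the genetic algorithm's fitness function and target rule function does evaluate the value and importance of rules effectively, and that genetic evolution does provide a suitable minimum-threshold recommendation quickly and objectively, improving both the quality and the efficiency of association rule mining. These results demonstrate that the proposed method is practical for mining association rules.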
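For readers unfamiliar with the two thresholds the abstract refers to, the following is a minimal sketch of how support and confidence are conventionally computed on binary (item present or absent) transaction data. It uses the standard textbook definitions, not the thesis's temporary-support or data-index structures, and the function names and the toy transactions are assumptions made for illustration.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """Support of the whole rule divided by support of its antecedent."""
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


def passes_thresholds(antecedent: FrozenSet[str], consequent: FrozenSet[str],
                      transactions: List[Transaction],
                      min_support: float, min_confidence: float) -> bool:
    """Keep a rule only if it meets both user-chosen minimum thresholds."""
    rule_support = support(antecedent | consequent, transactions)
    rule_confidence = confidence(antecedent, consequent, transactions)
    return rule_support >= min_support and rule_confidence >= min_confidence


# Hypothetical toy data: four baskets over three items.
TRANSACTIONS: List[Transaction] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
BREAD = frozenset({"bread"})
MILK = frozenset({"milk"})

if __name__ == "__main__":
    print(support(BREAD | MILK, TRANSACTIONS))
    print(confidence(BREAD, MILK, TRANSACTIONS))
    print(passes_thresholds(BREAD, MILK, TRANSACTIONS, 0.4, 0.6))
```

On this toy data the rule bread → milk has support 0.5 and confidence 2/3, so it is kept with thresholds (0.4, 0.6) and dropped if the minimum support is raised to 0.6. That sensitivity is the reason threshold selection matters.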

Parallel Abstract


Association rule mining is one of the most important and useful techniques in data mining. It extracts previously unknown information from large databases and summarizes meaningful relations among items to help business managers make better decisions. Currently, most applications focus on market basket analysis in supermarkets. Although these applications do improve the understanding of item relationships, most of them consider only buying or not-buying behavior. They do not consider important item properties, such as profit, to raise the quality of the generated association rules. In addition, choosing suitable threshold values for support and confidence is critical to the quality of the mined rules, yet little research addresses how to choose these thresholds.

This thesis introduces a new association rule algorithm to address these limitations. The algorithm uses the concepts of temporary support and a data index to mine association rules from transaction data in binary format. To account for other important item properties, it uses weighted items to represent the importance of individual items. These weights are used in the fitness function of a heuristic genetic algorithm (GA) to estimate the value of different rules. The genetic algorithm can then generate suitable threshold values for association rule mining.

The proposed method is applied successfully to several retail transaction databases and one real-world credit card database. These applications show that using weighted items in the fitness function of a genetic algorithm estimates the value of association rules efficiently, and that the genetic algorithm can suggest suitable threshold values that yield quality rules. These results demonstrate that the proposed algorithm is a practical method for increasing the quality of generated association rules.
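To make the weighted-item idea concrete, here is a small sketch of a rule score that a genetic algorithm could use as a fitness value. The abstract does not state the thesis's actual fitness function, so the formula below (mean item weight multiplied by support and by confidence), the name `weighted_fitness`, and the weights themselves are all assumptions chosen for illustration only.

```python
from typing import Dict, FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def weighted_fitness(antecedent: FrozenSet[str], consequent: FrozenSet[str],
                     transactions: List[Transaction],
                     weights: Dict[str, float]) -> float:
    """Hypothetical score: mean item weight * support * confidence.

    This is NOT the thesis's fitness function, only one plausible way to let
    item importance (for example, profit) influence how a rule is valued.
    """
    items = antecedent | consequent
    if not items:
        return 0.0
    rule_support = support(items, transactions)
    antecedent_support = support(antecedent, transactions)
    if antecedent_support == 0.0:
        return 0.0
    rule_confidence = rule_support / antecedent_support
    # Items without a listed weight contribute 0.0 to the mean.
    mean_weight = sum(weights.get(item, 0.0) for item in items) / len(items)
    return mean_weight * rule_support * rule_confidence


# Hypothetical toy data and weights (e.g. normalized profit per item).
TRANSACTIONS: List[Transaction] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
WEIGHTS: Dict[str, float] = {"bread": 0.2, "milk": 0.3, "butter": 0.9}

BREAD = frozenset({"bread"})
MILK = frozenset({"milk"})
BUTTER = frozenset({"butter"})

FIT_MILK = weighted_fitness(BREAD, MILK, TRANSACTIONS, WEIGHTS)
FIT_BUTTER = weighted_fitness(BREAD, BUTTER, TRANSACTIONS, WEIGHTS)

if __name__ == "__main__":
    print(FIT_MILK, FIT_BUTTER)
```

In this toy data both rules have the same support (0.5) and the same confidence (2/3), so unweighted mining cannot tell them apart. The weighted score ranks bread → butter above bread → milk because butter carries a higher weight, which is the kind of distinction the abstract attributes to weighted items.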


Cited by


蔡旻宏 (2004). 資料挖掘網路服務系統之探討 [Master's thesis, Yuan Ze University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0009-0112200611320899
黃博聞 (2004). 發展一個關聯分群方法於多物項存貨管理 [Master's thesis, Yuan Ze University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0009-0112200611314735
邱宇婷 (2006). 應用粒子群最佳化演算法於關聯法則探勘之研究 [Master's thesis, National Taipei University of Technology]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0006-0307200616581000
劉冠緯 (2007). 應用時間加權於可重複序列之研究-以預測線上顧客消費狀態為例 [Master's thesis, National Taipei University of Technology]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0006-2106200716302600
羅志忠 (2010). 利用概念地圖之建立於試題編製優劣之研究─以國小中年級分數概念為例 [Master's thesis, National Formosa University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0028-2207201018123800
