
以價值為基礎之資料探勘

The Development of Profit-Based Data Mining Techniques

Advisors: 趙莊敏, 陳穆臻

Abstract


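Continuing advances in information technology have made it easier for enterprises to collect and store data. Because traditional data analysis tools and techniques cannot process large volumes of data efficiently, data mining techniques have come into wide use; their purpose is to extract interesting patterns or rules from large amounts of data. In association analysis, the most typical example is market basket analysis, which finds associations among the items in customer transactions. This helps decision makers understand customers' purchasing habits and apply that information to decisions on marketing, shelf arrangement, product bundling, and cross-selling. Support and confidence are commonly used to mine association rules. However, filtering rules by minimum support and minimum confidence alone generates a large number of rules, many of which have no value, and may also overlook rules that do have value. How to mine patterns or rules that account for both objective measures (for example, the probability that an event occurs) and subjective measures (for example, profit) is therefore a challenge for data mining techniques and a question that enterprises want answered. This study emphasizes value-based pattern evaluation within the data mining process. Data envelopment analysis is applied to compute an efficiency score for each association rule from multiple criteria, and the rules are ranked by importance so that suitable rules can be selected. After the efficiency scores are computed, a decision tree classification model is built with whether a rule is relatively efficient as the class variable and the rule attributes as input variables, in order to identify the characteristics of relatively efficient rules and the important rule attributes; the resulting model can also predict whether a new association rule is relatively efficient (that is, interesting). In addition, because enterprise resources and marketing budgets are limited, selecting valuable rules is very important. This study applies optimization methods to design two mathematical programming models that select the most valuable combination of rules, giving marketing decision makers a reference for carrying out marketing activities and thereby increasing marketing profit.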
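The support and confidence measures mentioned above can be illustrated with a minimal sketch. The transactions, item names, and helper functions below are hypothetical and are not taken from the thesis; they only show how the two objective measures are usually computed for a rule such as bread -> milk.

```python
# Toy transactions (hypothetical data, not from the thesis).
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk", "wine"},
    {"bread", "milk", "wine"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    n_antecedent = count(antecedent, transactions)
    if n_antecedent == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / n_antecedent


# Rule under evaluation: bread -> milk
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

Neither measure says anything about profit, which is why a rule can pass both thresholds and still be of little value to a marketer.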

Parallel Abstract


With the continuous growth of information technology, it has become more convenient for enterprises to collect and store data. Traditional data analysis tools and techniques have difficulty handling massive datasets, and data mining techniques have been developed to address this shortcoming. Data mining aims to discover interesting patterns or rules in large amounts of data. Market basket analysis is a typical application of association rule mining. It finds relationships among items in large volumes of transaction data and helps marketing analysts understand customers' purchasing behavior. The information discovered can support decisions on marketing promotions, shelf space management, bundle selling, cross-selling, and so on. Generally, support and confidence (objective measures) are used to measure the interestingness of association rules. Using only these two measures may generate a large set of rules, most of which may not be valuable (not interesting), so it is difficult to discover interesting association rules by setting only minimum support and minimum confidence constraints. Therefore, discovering patterns by considering both objective measures (e.g., probability) and subjective measures (e.g., profit) is a challenge in data mining, particularly in business applications. This thesis focuses on pattern evaluation in the knowledge discovery process, using the concept of profit mining. Data Envelopment Analysis (DEA) is used to calculate the efficiency of association rules under multiple criteria and to rank them. After their efficiency is evaluated, the rules are divided into two classes: relatively efficient (interesting) and relatively inefficient (uninteresting). A Decision Tree (DT) based classifier is then built from the rule attributes to distinguish these two classes. Furthermore, the rules generated by the DT are used to identify the characteristics of interesting rules, and the constructed classifier can also be used to classify unknown (new) association rules. In addition, selecting profitable rules for marketing is an important issue. This thesis therefore constructs two mathematical models that choose the optimal set of association rules under the objective of maximum revenue, subject to a set of constraints. These models can help marketing analysts design promotion campaigns and increase marketing profit.
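The abstract does not give the form of the two mathematical models, so the following is only a sketch of the general idea of choosing a rule set that maximizes revenue under a constraint. It assumes a single budget constraint and solves the resulting 0/1 selection problem by brute-force enumeration; the rule names, revenues, costs, and the function `best_rule_set` are all hypothetical.

```python
from itertools import combinations

# Hypothetical candidate rules: (name, expected revenue, promotion cost).
rules = [
    ("bread -> milk", 40, 30),
    ("milk -> wine", 70, 50),
    ("bread -> butter", 25, 20),
    ("wine -> cheese", 60, 40),
]
budget = 90  # assumed marketing budget, same units as the costs


def best_rule_set(rules, budget):
    """Enumerate every subset of rules and keep the feasible one with the highest revenue."""
    best_revenue, best_subset = 0, ()
    for size in range(1, len(rules) + 1):
        for subset in combinations(rules, size):
            cost = sum(rule[2] for rule in subset)
            revenue = sum(rule[1] for rule in subset)
            if cost <= budget and revenue > best_revenue:
                best_revenue, best_subset = revenue, subset
    return best_revenue, [rule[0] for rule in best_subset]


best_revenue, chosen = best_rule_set(rules, budget)
print(best_revenue, chosen)
```

Enumeration is only practical for a handful of candidate rules; a realistic rule set would call for an integer programming solver, and the thesis's own models may include constraints that this sketch omits.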

