A Study of Customer Purchasing Behavior Using Constraint-Based Sequential Rule Methods

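Sequential pattern mining is an important data mining method whose goal is to discover time-related behavior patterns in sequence databases. In recent years it has been used to extract useful information in a variety of domains, such as marketing decisions, medical record analysis, and sales analysis. Most earlier sequential pattern mining methods consider only the frequency of sequential patterns, mainly because they assume that sequence data do not change over time. In practice, however, business sales data are highly dynamic and complex, so sequential behavior often changes over time. To address this, we divide the problem into two sub-problems, sequential pattern mining in business-to-business (B2B) environments and sequential pattern mining in business-to-customer (B2C) environments, because the sequence data in each environment have their own characteristics. We then introduce three new concepts: recency, repetition, and compactness. Recency lets the discovered patterns reflect the most recent behavior; repetition ensures that the number of times a pattern occurs within a single sequence meets a user-specified minimum; and compactness ensures that a pattern occurs within a user-defined time interval. For the two environments, we use these three concepts to define two distinct kinds of sequential patterns and develop two efficient algorithms. Extensive experimental evaluation shows that both algorithms are efficient and that, when sequence data are highly dynamic, they find more interesting sequential patterns than traditional methods.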

Keywords

Sequence data; constraint-based data mining; time-series database

Parallel Abstract

Sequential pattern mining is an important data-mining method for discovering time-related behavior in sequence databases. The information it yields can be used in marketing, medical record analysis, sales analysis, and other domains. Existing methods focus only on frequency, because they assume that the behavior of sequences does not change over time. Business sales environments, however, are highly dynamic and complicated, so the behavior of sequences may change over time. In this study, we first divide the problem into two sub-problems, sequential pattern mining in business-to-business (B2B) environments and in business-to-customer (B2C) environments, because each has distinct sequence characteristics. Three new concepts, recency, repetition, and compactness, are then incorporated into traditional sequential pattern mining to discover meaningful patterns in these two environments. Recency makes patterns adapt quickly to the latest behavior in the sequence database. Repetition ensures that the number of occurrences of a pattern in a data-sequence meets a user-specified threshold. Compactness ensures that the discovered patterns have reasonable time spans. Two new kinds of patterns and efficient algorithms for mining them are presented in this dissertation, together with a thorough empirical evaluation. The results show that the proposed methods are computationally efficient and outperform traditional methods when the behavior of sequences changes over time.
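To make the three constraints concrete, the following is a minimal Python sketch of how one data-sequence might be tested against a pattern. It is an illustration under stated assumptions, not the dissertation's algorithm: the abstract does not give formal definitions, so the event representation, the function names (`earliest_end`, `compact_occurrences`, `satisfies_constraints`), and the parameters (`max_span`, `min_repeat`, `now`, `recency_window`) are all hypothetical. Here compactness is read as a cap on the time between the first and last matched event, repetition as a minimum number of non-overlapping compact occurrences, and recency as requiring that some compact occurrence ends within a recent time window.

```python
from typing import FrozenSet, List, Optional, Sequence, Tuple

# Assumed representation: (timestamp, items purchased at that time), sorted by time.
Event = Tuple[int, FrozenSet[str]]
Pattern = Sequence[FrozenSet[str]]


def earliest_end(seq: Sequence[Event], pattern: Pattern, start: int) -> Optional[int]:
    """Index of the earliest event completing `pattern` when its first
    element is matched at `start`; None if no such occurrence exists."""
    if not pattern[0] <= seq[start][1]:
        return None
    pos = start
    for element in pattern[1:]:
        pos += 1
        # Advance to the next event that contains this pattern element.
        while pos < len(seq) and not element <= seq[pos][1]:
            pos += 1
        if pos == len(seq):
            return None
    return pos


def compact_occurrences(seq: Sequence[Event], pattern: Pattern,
                        max_span: int) -> List[Tuple[int, int]]:
    """(start, end) index pairs of occurrences whose time span is at most max_span."""
    spans = []
    for start in range(len(seq)):
        end = earliest_end(seq, pattern, start)
        if end is not None and seq[end][0] - seq[start][0] <= max_span:
            spans.append((start, end))
    return spans


def satisfies_constraints(seq: Sequence[Event], pattern: Pattern, max_span: int,
                          min_repeat: int, now: int, recency_window: int) -> bool:
    """Check the assumed compactness, repetition, and recency conditions."""
    spans = sorted(compact_occurrences(seq, pattern, max_span), key=lambda s: s[1])
    # Repetition: greedily count non-overlapping compact occurrences.
    count, last_end = 0, -1
    for start, end in spans:
        if start > last_end:
            count += 1
            last_end = end
    # Recency: at least one compact occurrence ends inside the recent window.
    recent = any(now - seq[end][0] <= recency_window for _, end in spans)
    return count >= min_repeat and recent


purchases: List[Event] = [
    (1, frozenset({"a"})), (2, frozenset({"b"})),
    (10, frozenset({"a"})), (11, frozenset({"b"})),
    (30, frozenset({"a", "c"})), (31, frozenset({"b"})),
]
a_then_b: Pattern = [frozenset({"a"}), frozenset({"b"})]
print(satisfies_constraints(purchases, a_then_b, max_span=5,
                            min_repeat=2, now=32, recency_window=5))
```

In this toy sequence the pattern "a, then b" occurs three times within one time unit each, and the last occurrence ends at time 31, so the check passes for `now=32`. The same sequence fails if `now` is moved far into the future, which is the effect the recency concept is meant to capture.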

Parallel Keywords

Constraint-based mining; temporal database; sequential pattern


