考量時間機率之循序樣式探勘方法

A Sequential Pattern Mining Approach Considering Time Probability

Advisor: 徐煥智

Abstract


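Sequential pattern mining is a data mining method that studies how to find frequent sequential patterns in a sequence database. Earlier sequence mining methods fall into two broad categories [2]: Apriori-like methods [7][8][9][23] and pattern-growth methods [10][12][16][17][20]. Within a sequential pattern, the probability of the time interval between two events gives decision makers additional information for analyzing and predicting how correlated patterns change. Previous research, however, has not produced a technique that finds this probability during the pattern mining process itself. To provide this information, we extend the PrefixSpan algorithm into a new algorithm, PCTP (PrefixSpan Considering Time Probability). The method can also reduce the number of patterns generated during mining by applying a minimum probability constraint. This study compares PCTP with existing sequential pattern mining methods experimentally, and the results show that PCTP makes up for shortcomings of earlier related methods. The performance study also shows that PCTP is an effective method that condenses correlated patterns and supplies additional time-probability information for sequential patterns.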

Parallel Abstract


Sequential pattern mining is a data mining method used to find frequent sequential patterns in a sequence database. Conventional sequence mining methods can be divided into two categories [2]: Apriori-like methods [7][8][9][23] and pattern-growth methods [10][12][16][17][20]. The time-interval probability between two events in a sequential pattern gives decision makers more information for analyzing and predicting the behavior of correlated patterns. However, previous studies have not developed a technique that discovers this probability during the pattern mining process itself. To provide such information, we extend the PrefixSpan method and develop a new sequential pattern mining approach, PCTP (PrefixSpan Considering Time Probability). The proposed approach can also reduce the number of patterns produced during mining by applying a minimum probability constraint. The proposed approach is compared with existing sequential pattern mining methods to show how they complement each other in discovering association rules. Our performance study shows that PCTP is a valuable approach that condenses the correlated patterns and provides additional time-interval probability information for sequential patterns.
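To make the idea of a time-interval probability concrete, the following is a minimal Python sketch and not the PCTP algorithm itself, whose details this abstract does not give. It assumes each sequence is a time-ordered list of `(time, item)` pairs, measures the gap from the first occurrence of `a` to the first later occurrence of `b`, and treats the probability of a gap as its share among the sequences that support the pattern. The function names, the parameters `min_support` and `min_prob`, and the toy database are all hypothetical.

```python
from collections import Counter


def first_gap(sequence, a, b):
    """Return the time gap from the first `a` to the first later `b`, or None."""
    start = None
    for time, item in sequence:
        if start is None:
            if item == a:
                start = time
        elif item == b:
            return time - start
    return None


def interval_probabilities(db, a, b, min_support=0.0, min_prob=0.0):
    """Estimate the support of <a, b> and the probability of each time gap.

    Assumption: probability of a gap = (supporting sequences with that gap)
    / (all supporting sequences). Gaps below `min_prob` are dropped, which
    mimics a minimum probability constraint.
    """
    gaps = [g for g in (first_gap(s, a, b) for s in db) if g is not None]
    support = len(gaps) / len(db) if db else 0.0
    if not gaps or support < min_support:
        return support, {}
    counts = Counter(gaps)
    probs = {gap: n / len(gaps) for gap, n in counts.items()}
    return support, {gap: p for gap, p in probs.items() if p >= min_prob}


# Toy sequence database (hypothetical data): lists of (time, item) pairs.
db = [
    [(1, "a"), (3, "b")],
    [(2, "a"), (4, "b"), (9, "c")],
    [(1, "a"), (6, "b")],
    [(1, "c"), (2, "b")],
]
support, probs = interval_probabilities(db, "a", "b", min_support=0.5, min_prob=0.5)
print(support, probs)
```

In this toy database three of the four sequences contain `a` followed by `b`, with gaps of 2, 2, and 5 time units, so only the gap of 2 survives a minimum probability of 0.5. A pattern-growth miner would apply the same kind of check while extending each prefix, rather than to a single fixed pair as done here.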

References


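[3] 張昭憲, 郭家君, “Parallel Sequential Pattern Mining for Distributed Systems,” Master's thesis, Graduate Institute of Information Management, Tamkang University, June 2007.
[4] 楊燕珠, 陳仕昇, 林哲民, “Mining Partial Periodic Patterns with Multiple Supports,” Master's thesis, Graduate Institute of Information Management, Tatung University, June 2005.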
[9] Agrawal, R., Srikant, R., “Mining Sequential Patterns: Generalizations and Performance Improvements,” In 5th Int. Conf. Extending Database Technology, March 1996.
[10] Chen, Y.L., Chiang, M.C. and Ko, M.T., “Discovering time-interval sequential patterns in sequence databases,” Expert Systems with Applications, Vol. 25, No. 3, 2003, pp. 343-354.
[11] Chen, Y.L., Chen, S.S., and Hsu, P.Y., “Mining hybrid sequential patterns and sequential rules,” Information Systems, Vol. 27, No. 5, 2002, pp. 345-362.

Cited By


林師晟 (2010). A Longest Frequent Sequential Pattern Mining Algorithm with Time Constraints [Master's thesis, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2010.00835
