Progressive Sequential Pattern Mining across Multiple Data Streams

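Sequential pattern mining aims to discover frequent sequential patterns. Once a set of sequential patterns has been discovered, newly arriving patterns may fail to be recognized as frequent because of the patterns that already exist. Progressive sequential pattern mining therefore aims to discover the most up-to-date sequential patterns as sequence data keeps arriving, while deleting patterns that have become obsolete. When sequence data arrives from multiple data streams at the same time, maintaining and updating the frequent sequential patterns becomes more difficult. Worse still, once sequences that span multiple data streams are considered, previously proposed methods can no longer mine frequent sequential patterns efficiently. In this thesis, we propose an efficient algorithm, PAMS, to solve these problems. PAMS uses a PSM-tree data structure to insert new items, update current items, and delete obsolete items. Experimental results show that PAMS significantly outperforms previously proposed methods for progressive sequential pattern mining across multiple data streams.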

Keywords

Progressive mining; sequential patterns; across multiple data streams

Parallel Abstract

Sequential pattern mining finds data sequences that occur frequently over time. Once sequential patterns have been generated, newly arriving patterns may not be identified as frequent sequential patterns because of the old data and sequences that already exist. Progressive sequential pattern mining aims to find the most up-to-date sequential patterns, given that obsolete items are deleted from the sequences. When sequences arrive from multiple data streams, it is difficult to maintain and update the current sequential patterns. Even worse, when sequences across multiple streams are considered, previous methods cannot compute the frequent sequential patterns efficiently. In this work, we propose an efficient algorithm, PAMS, to address this problem. PAMS uses a PSM-tree to insert new items, update current items, and delete obsolete items. The experimental results show that PAMS significantly outperforms previous algorithms for mining progressive sequential patterns across multiple streams.

Parallel Keywords

Progressive Mining; Sequential Pattern; Multiple Data Streams
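
The progressive setting described in the abstract (insert new items, keep the patterns current, delete obsolete items) can be illustrated with a small sketch. This is not PAMS and it does not use a PSM-tree, since the abstract does not describe their internals. It is a naive baseline that recounts everything on each query, written only to make the problem concrete. The class name `ProgressiveWindow`, the `window` and `min_support` parameters, the event layout, and the restriction to ordered pairs of items are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from itertools import combinations


class ProgressiveWindow:
    """Naive sliding-window miner for ordered item pairs across streams.

    Hypothetical illustration only: not the PAMS algorithm or its PSM-tree.
    Assumes add() is called with non-decreasing timestamps.
    """

    def __init__(self, window, min_support):
        self.window = window            # number of most recent timestamps kept
        self.min_support = min_support  # minimum number of supporting sequences
        self.events = deque()           # (timestamp, stream_id, seq_id, item)

    def add(self, timestamp, stream_id, seq_id, item):
        """Insert a new element, then delete elements that left the window."""
        self.events.append((timestamp, stream_id, seq_id, item))
        cutoff = timestamp - self.window
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()

    def frequent_pairs(self):
        """Return {((stream_a, item_a), (stream_b, item_b)): support}."""
        # Group the elements still inside the window by sequence id.
        by_seq = defaultdict(list)
        for ts, stream, seq, item in self.events:
            by_seq[seq].append((ts, stream, item))

        # A sequence supports a pair if the first element strictly precedes
        # the second in time; the two elements may come from different streams.
        supporters = defaultdict(set)
        for seq, elems in by_seq.items():
            for (t1, s1, i1), (t2, s2, i2) in combinations(elems, 2):
                if t1 < t2:
                    supporters[((s1, i1), (s2, i2))].add(seq)

        return {pair: len(seqs) for pair, seqs in supporters.items()
                if len(seqs) >= self.min_support}


miner = ProgressiveWindow(window=3, min_support=2)
miner.add(1, "S1", "c1", "A")
miner.add(1, "S1", "c2", "A")
miner.add(2, "S2", "c1", "B")
miner.add(2, "S2", "c2", "B")
print(miner.frequent_pairs())  # A in stream S1 followed by B in stream S2

miner.add(4, "S1", "c3", "C")  # timestamp 1 is now obsolete
print(miner.frequent_pairs())  # the A -> B pattern is no longer supported
```

Recounting from scratch on every query is what makes this baseline expensive as the number of streams and sequences grows. An incremental structure that updates candidate patterns as elements arrive and expire is the kind of approach the abstract attributes to PAMS.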

