Mining Closed Numerical Patterns in Spatial-Temporal Databases

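Mining patterns in spatial-temporal databases can help us understand the continuous trends of change in objects or events that are distributed across different geographic locations. In this thesis, we therefore propose an efficient mining algorithm, called STP-Mine, for finding closed numerical patterns in spatial-temporal databases. The algorithm consists of three phases. In the first phase, we generate all frequent patterns of length 1 together with their projected databases. In the second phase, we use a frequent pattern tree to recursively generate all frequent patterns in the spatial dimension in a depth-first search manner. In the third phase, we use the frequent pattern tree to recursively generate all frequent patterns in the temporal dimension, again in a depth-first search manner. The steps of the second and third phases are repeated recursively until no more frequent spatial-temporal patterns are generated. During mining, we apply several effective pruning strategies to avoid generating unnecessary candidate patterns, and we check whether each generated pattern is closed. Experimental results show that the proposed method outperforms the modified A-Close algorithm on both synthetic and real data.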

Keywords

data mining; spatial-temporal database; closed pattern; frequent pattern

Parallel Abstract

Mining spatial-temporal patterns can help us retrieve valuable, implicit information from the large volume of spatial-temporal data in a database. In this thesis, we propose a novel algorithm, STP-Mine (Spatial-Temporal Patterns-Mine), to mine closed spatial-temporal patterns (stpatterns) in a spatial-temporal database. The proposed algorithm consists of three phases. First, we find all frequent length-1 patterns (1-patterns) and construct a projected database for each frequent 1-pattern found. Second, we recursively generate frequent super-patterns in the spatial dimension in a depth-first search manner. Third, once a pattern cannot grow further in the spatial dimension, we extend it in the temporal dimension in a depth-first search manner. The steps in the second and third phases are repeated until no more frequent closed patterns can be found. During the mining process, we employ several effective pruning strategies to prune unnecessary candidates and a closure checking scheme to remove non-closed stpatterns. The experimental results show that the STP-Mine algorithm is efficient and scalable, and that it outperforms the modified A-Close algorithm by one order of magnitude.
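The general shape of such a procedure (frequent 1-patterns, projected databases, depth-first pattern growth with support-based pruning, and a final closure check) can be illustrated with a minimal sketch. This is not STP-Mine itself: it assumes a simplified database in which each record is a plain set of items, it does not model the spatial or temporal dimensions, and all function names (`frequent_1_patterns`, `mine_frequent`, `closed_only`) are hypothetical names chosen for illustration.

```python
from collections import Counter


def frequent_1_patterns(db, min_sup):
    """Phase-1 analogue: count each item once per record, keep the frequent ones."""
    counts = Counter(item for record in db for item in set(record))
    return {item: c for item, c in counts.items() if c >= min_sup}


def mine_frequent(db, min_sup):
    """Depth-first pattern growth over itemsets (a simplification, not STP-Mine).

    Returns a dict mapping each frequent pattern (a frozenset) to its support.
    """
    result = {}

    def grow(prefix, projected, candidates):
        for i, item in enumerate(candidates):
            # Projected database: records that still contain the extended pattern.
            sub = [record for record in projected if item in record]
            if len(sub) < min_sup:
                continue  # prune: no super-pattern of an infrequent pattern is frequent
            pattern = prefix | {item}
            result[pattern] = len(sub)
            # Extend only with later items so each pattern is generated once.
            grow(pattern, sub, candidates[i + 1:])

    items = sorted(frequent_1_patterns(db, min_sup))
    grow(frozenset(), db, items)
    return result


def closed_only(patterns):
    """Keep a pattern only if no proper super-pattern has the same support."""
    return {
        p: s
        for p, s in patterns.items()
        if not any(p < q and s == t for q, t in patterns.items())
    }


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
    frequent = mine_frequent(db, min_sup=2)
    for pattern, support in closed_only(frequent).items():
        print(sorted(pattern), support)
```

In this toy database with a minimum support of 2, the patterns {b} and {c} are frequent but not closed, because {a, b} and {a, c} occur in exactly the same records. The sketch checks closure after mining for clarity, whereas the abstract describes pruning and closure checking as part of the mining process.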

Parallel Keywords

data mining; spatial-temporal database; closed pattern; frequent pattern

