Fast Discovery of Sequential Patterns through Memory Indexing and Database Partitioning

Abstract


Sequential pattern mining is challenging because discovering temporal patterns across a large number of sequences is computationally expensive. Current mining approaches require either repeated database scans or the generation of many intermediate databases. As main memory continues to grow, databases increasingly fit into it, which makes efficient memory-based discovery of sequential patterns feasible. In this paper, we propose MEMISP, a memory-indexing approach for fast sequential pattern mining. Over the whole process, MEMISP scans the sequence database only once, to read the data sequences into memory. A find-then-index technique is applied recursively: it finds the items that extend a frequent sequence and constructs a compact index set identifying the data sequences to explore further. Because the index advances effectively, MEMISP processes fewer and shorter data sequences as the discovered patterns get longer. Moreover, we can estimate the maximum total memory that MEMISP requires, and this bound is independent of the minimum support threshold. Experimental results indicate that MEMISP outperforms both GSP and PrefixSpan (general version), and that it needs neither candidate generation nor database projection. When the database is too large to fit into memory in one batch, we partition the database, mine patterns in each partition, and validate the true patterns in a second database scan. Experiments on extra-large databases demonstrate the good performance and scalability of MEMISP, even with very low minimum support. MEMISP can therefore efficiently mine sequence databases of any size, for any minimum support value.
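
The find-then-index recursion and the partition-then-validate fallback described in the abstract can be illustrated with a minimal Python sketch. It is a simplification written for this summary, not the authors' implementation: it assumes every element of a data sequence is a single item (no itemsets), keeps the index set as plain (sequence id, position) pairs, and the names `mine`, `mine_partitioned`, and `is_subsequence` are hypothetical.

```python
import math
from collections import Counter


def mine(db, min_count):
    """Return {pattern tuple: support} for an in-memory sequence database.

    Assumption: each sequence is a list of single items (no itemsets).
    """
    patterns = {}

    def grow(prefix, index):
        # Find step: count each item once per indexed sequence, looking
        # only at the part of the sequence after the recorded position.
        counts = Counter()
        for sid, pos in index:
            counts.update(set(db[sid][pos:]))
        for item, count in sorted(counts.items()):
            if count < min_count:
                continue
            pattern = prefix + (item,)
            patterns[pattern] = count
            # Index step: advance each pointer just past the first match
            # of `item`; sequences without a match drop out of the index.
            new_index = []
            for sid, pos in index:
                try:
                    new_index.append((sid, db[sid].index(item, pos) + 1))
                except ValueError:
                    pass
            grow(pattern, new_index)

    grow((), [(sid, 0) for sid in range(len(db))])
    return patterns


def is_subsequence(pattern, seq):
    """True if the items of `pattern` appear in `seq` in order."""
    it = iter(seq)
    return all(item in it for item in pattern)


def mine_partitioned(db, min_ratio, n_parts):
    """Mine each partition separately, then validate candidates in a second pass."""
    size = max(1, math.ceil(len(db) / n_parts))
    candidates = set()
    for start in range(0, len(db), size):
        part = db[start:start + size]
        # A globally frequent pattern is frequent in at least one partition
        # at the same relative support, so local mining misses none of them.
        local_min = max(1, math.ceil(min_ratio * len(part)))
        candidates.update(mine(part, local_min))
    global_min = max(1, math.ceil(min_ratio * len(db)))
    result = {}
    for pattern in candidates:
        support = sum(is_subsequence(pattern, seq) for seq in db)
        if support >= global_min:
            result[pattern] = support
    return result
```

In this sketch the index set shrinks as patterns grow, because a sequence leaves the index once it can no longer extend the current pattern, and the positions move rightward so that less of each sequence is rescanned. For example, `mine([list("abc"), list("acb"), list("ab"), list("bc")], 2)` reports `("a", "b")` with support 3, and `mine_partitioned` on the same data with `min_ratio=0.5` and two partitions yields the same patterns after validation.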

Cited by


Chen, S. P. (2010). 重複購買行為分析 [Analysis of repeat purchase behavior] [Doctoral dissertation, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2010.01110
Chen, Y. C. (2012). 探勘時間間隔循序特徵樣式之相關研究 [A study on mining time-interval sequential patterns] [Doctoral dissertation, National Chiao Tung University]. Airiti Library. https://doi.org/10.6842/NCTU.2012.00302
