
A Framework for Discovering Variable-length Motifs in Medical Data Streams

Abstract


In this paper, we explore two key problems in time series motif discovery: relaxing the constraints on trivial matching between subsequences of different lengths, and improving time and space efficiency. Trivial matches are avoided to prevent excessive overlap between subsequences when their similarities are computed. We describe a limited-length enhanced suffix array based framework (LiSAM) that addresses both problems. We first convert the continuous time series into a discrete time series using the Symbolic Aggregate approXimation (SAX) procedure. We then introduce two covering relations on the discrete subsequences to support motif discovery: α-covering between the instances of LCP (Longest Common Prefix) intervals, and β-covering between LCP intervals. If an LCP interval is β-uncovered, its instances form a motif. The βUncover algorithm of LiSAM identifies the β-uncovered l-intervals; within it, we introduce two LCP tabs, presuf and nextsuf, to support the identification of the α-uncovered instances of an l-interval. Experimental results on electrocardiogram signals indicate that LiSAM accurately finds motifs of different lengths.
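The preprocessing described above can be illustrated with a minimal sketch: a standard SAX discretization followed by a naive suffix array and LCP array over the resulting word. This is not the LiSAM implementation. The function names (`sax`, `suffix_array`, `lcp_array`), the four-letter alphabet, and the segment count are assumptions chosen for illustration, and the α-covering and β-covering tests are not shown because the abstract does not define them.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Discretize a numeric series into a SAX word (illustrative sketch).

    Assumes len(series) >= n_segments so that every frame is non-empty.
    """
    mu, sigma = mean(series), pstdev(series)
    # z-normalize; a constant series is mapped to all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Piecewise Aggregate Approximation: mean of each frame
    n = len(z)
    paa = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        paa.append(mean(z[start:end]))

    # Breakpoints that split N(0, 1) into equiprobable regions
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Map each PAA value to the symbol of the region it falls in
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


def suffix_array(word):
    """Naive suffix array: start positions of suffixes in sorted order."""
    return sorted(range(len(word)), key=lambda i: word[i:])


def lcp_array(word, sa):
    """lcp[i] = length of the common prefix of suffixes sa[i-1] and sa[i]."""
    lcp = [0] * len(sa)
    for i in range(1, len(sa)):
        a, b = word[sa[i - 1]:], word[sa[i]:]
        length = 0
        while length < min(len(a), len(b)) and a[length] == b[length]:
            length += 1
        lcp[i] = length
    return lcp


if __name__ == "__main__":
    word = sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4)
    sa = suffix_array(word)
    print(word, sa, lcp_array(word, sa))
```

Runs of adjacent suffix-array entries with a large LCP value correspond to repeated symbolic subsequences, which is the raw material an LCP-interval based method works from. The sorting here is quadratic in the worst case and is meant only to make the data structures concrete.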
