A Study on Clustering Long Time Series

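Time series analysis helps us understand how real-world events change and behave, and clustering time series by similarity supports the prediction and planning of future behavior. Most current data mining applications analyze only basic customer data, and mining such data alone yields results that lack depth. Combining basic customer data with time series data gives results with deeper and more complete business meaning. For time series clustering, we therefore propose two clustering modes: multi-customer and single-customer. The multi-customer mode aims to separate customers into groups with different characteristics, while the single-customer mode aims to find, for one specific customer, the characteristics and patterns of that customer's behavior over different time periods. Because time series are high-dimensional and the data volume is very large, clustering the raw series directly is computationally expensive and often produces poor clusters. This thesis therefore focuses on speeding up the clustering process and improving the quality of the clustering results.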

Keywords

Data mining; time series; cluster analysis

English Abstract

Time series are ubiquitous in our world: anything that changes over time can be regarded as a time series. Time series analysis lets us understand the behavior underlying the data. The aim of clustering time series is to discover repetitive patterns and trends, and to support prediction and planning for the future. Most data mining applications focus only on mining customers' personal data. If we combine customers' personal data with time series data, we can obtain more complete business knowledge. In this paper, we propose a multi-customer and a single-customer clustering mode. The purpose of the multi-customer mode is to cluster customers into distinct groups, and the purpose of the single-customer mode is to find the characteristics of a specific customer at different times. Because time series data are high-dimensional, the amount of data is huge. If we cluster the original series directly, the computational cost is high and the clustering results are poor. The key point of this research is to improve both the efficiency of time series clustering and the quality of the resulting clusters.
