A Nonparametric Multi-Seed Data Clustering Technique

A Nonparametric Data Clustering Method

Abstract


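A single cluster center cannot handle data distributions with elongated shapes. When the data form a complex shape, they must therefore be partitioned into several small clusters that are then merged into one group, so the centers of multiple small clusters are needed as the initial reference points of the final single cluster. This study proposes a nonparametric data clustering method that handles complex-shaped data distributions through a split-and-merge procedure. In the splitting stage, a genetic algorithm partitions the data into several small clusters and finds the most suitable cluster centers. Then a new decision algorithm developed in this study, which uses a minimal spanning tree and statistical methods, determines whether any neighboring small clusters should be merged into a single cluster. Finally, the effectiveness of the clustering method is verified on several data distributions and on real data.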

Parallel Abstract


Clustering data around a single seed does not work well when the cluster is elongated or non-convex; a complex-shaped cluster requires several seeds. This study develops a nonparametric multi-seed data clustering approach that uses splitting and merging procedures to handle clusters of complex shape. The splitting process uses a genetic algorithm to search for appropriate cluster centers, which partition the data into a number of small groups. To assign several seeds to one cluster, an innovative merging process based on a minimal spanning tree and statistical concepts is proposed to judge whether a pair of clusters should be merged or kept separate. Experimental results illustrate the difficulties of the one-seed-per-cluster approach and the effectiveness of the proposed clustering scheme.
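The split-and-merge idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the genetic algorithm or the exact merge test, so plain k-means (with a deterministic initialization) stands in for the genetic splitting step, and the merge rule shown here (join two sub-clusters when the gap between them is no larger than the mean plus `z` standard deviations of their pooled minimal-spanning-tree edge lengths) is an assumed criterion chosen only for illustration. The function names, the parameter `z`, and the toy data are likewise hypothetical.

```python
import math
from statistics import mean, pstdev


def kmeans(points, k, iters=50):
    """Lloyd's k-means with a deterministic init (a stand-in for the GA split)."""
    step = max(1, len(points) // k)
    centers = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


def mst_edge_lengths(points):
    """Prim's algorithm; returns the edge lengths of the minimal spanning tree."""
    if len(points) < 2:
        return []
    # best[i] = shortest known distance from point i to the growing tree
    best = {i: math.dist(points[0], points[i]) for i in range(1, len(points))}
    edges = []
    while best:
        j = min(best, key=best.get)
        edges.append(best.pop(j))
        for i in best:
            best[i] = min(best[i], math.dist(points[j], points[i]))
    return edges


def merge_subclusters(points, labels, z=2.0):
    """Assumed merge rule: join two sub-clusters if their gap is at most
    mean + z * std of their pooled MST edge lengths."""
    groups = {}
    for p, l in zip(points, labels):
        groups.setdefault(l, []).append(p)
    parent = {l: l for l in groups}  # union-find over sub-cluster labels

    def find(l):
        while parent[l] != l:
            parent[l] = parent[parent[l]]
            l = parent[l]
        return l

    edges = {l: mst_edge_lengths(g) for l, g in groups.items()}
    ids = sorted(groups)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            pooled = edges[a] + edges[b]
            if not pooled:  # two singleton sub-clusters: no statistics to use
                continue
            gap = min(math.dist(p, q) for p in groups[a] for q in groups[b])
            if gap <= mean(pooled) + z * pstdev(pooled):
                parent[find(a)] = find(b)
    return [find(l) for l in labels]


# Toy data: two elongated, parallel lines with slightly uneven point spacing.
xs = [0.5 * i + 0.1 * (i % 2) for i in range(40)]
points = [(x, 0.0) for x in xs] + [(x, 10.0) for x in xs]

sub_labels = kmeans(points, k=6)                       # split into small groups
final_labels = merge_subclusters(points, sub_labels)   # merge neighboring groups
print(len(set(sub_labels)), "sub-clusters ->", len(set(final_labels)), "clusters")
```

On this toy data a single seed per cluster would have to represent a whole line with one center, whereas the sketch first cuts each line into short segments and then rejoins segments whose separation looks like an ordinary within-cluster spacing. The threshold `z` is a free choice in this sketch, not a value taken from the paper.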
