  • Thesis


The Study of Cluster Ensemble

Advisor: 林志麟


Previous studies have shown that cluster ensemble techniques improve the robustness and stability of individual clustering algorithms. This paper experimentally compares the clustering results of individual clustering algorithms with those of a cluster ensemble algorithm. K-means and Fuzzy C-means are used as the individual clustering algorithms, and their results are combined by evidence accumulation (EA) to form the cluster ensemble. Experiments on several datasets show that a cluster ensemble algorithm is more suitable for datasets with many features, whereas for datasets with few features it is better to run an individual clustering algorithm many times and keep the best result.
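
As a rough illustration of how evidence accumulation can combine several K-means partitions, the following Python sketch builds a co-association matrix (the fraction of partitions in which two points share a cluster) and then groups points whose co-association exceeds a voting threshold. The function names (`kmeans`, `co_association`, `consensus_labels`), the 0.5 threshold, the choice of k = 3, and the toy data are assumptions made for illustration. The abstract does not say how the thesis extracts the final partition from the accumulated evidence, so this should not be read as the thesis's implementation.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, iters=20):
    """Basic Lloyd-style K-means; returns (labels, sum of squared errors)."""
    centers = rng.sample(points, k)  # random initial centers drawn from the data
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    # Final assignment against the final centers, so labels and SSE agree.
    labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
    sse = sum(dist2(p, centers[l]) for p, l in zip(points, labels))
    return labels, sse


def co_association(partitions):
    """Fraction of partitions in which each pair of points shares a cluster."""
    n, m = len(partitions[0]), len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus_labels(co, threshold=0.5):
    """Join points linked by co-association above the threshold (assumed rule)."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first walk over the thresholded graph
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy data (hypothetical): two Gaussian blobs in two dimensions.
rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(15)]
data += [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(15)]

# Ten K-means runs with different random initial centers form the ensemble.
partitions = [kmeans(data, 3, rng)[0] for _ in range(10)]
final = consensus_labels(co_association(partitions))
print("consensus clusters:", len(set(final)))
```

Because the co-association matrix only records whether two points share a cluster, it is unaffected by how each run happens to number its clusters, so partitions with permuted labels can be combined directly. The `sse` value returned by `kmeans` is what a "run many times and keep the best" baseline would compare across runs.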



