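The traditional K-means method requires the number of clusters to be set in advance before data can be clustered. To address this problem, this study develops a dynamic data clustering method based on Particle Swarm Optimization (PSO) for clustering problems in which the number of clusters is not known beforehand. The proposed method combines K-means with combinatorial particle swarm optimization, and we call it K-means with Combinatorial Particle Swarm Optimization (KCPSO). Before the algorithm starts, a parameter specifying the maximum number of clusters is given. Under this maximum, combinatorial particle swarm optimization adjusts the number of clusters of each particle according to a cluster validity index. The memory and information-sharing capabilities of particle swarm optimization are then used to select cluster centers, and K-means is used to adjust their positions. This reduces the influence of the initial cluster centers on K-means and yields an appropriate number of clusters and clustering result. The algorithm has been implemented as a system and validated on several data clustering problems. Experimental results show that, compared with other similar algorithms, KCPSO clusters data faster and more effectively.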
This paper presents a dynamic data clustering algorithm named K-means with Combinatorial Particle Swarm Optimization (KCPSO). Unlike the K-means method, KCPSO does not require the number of clusters to be specified before clustering and is able to find a proper number of clusters automatically. A maximum number of clusters is supplied as a predefined parameter, and a cluster validity index is used to evaluate the clustering results in order to adjust the number of clusters of each particle. The cluster centers of the particles are then adjusted by K-means. KCPSO thus not only reduces the sensitivity of K-means to its initial cluster centers but also determines a proper number of clusters. KCPSO has been developed into a system and evaluated on several datasets. Results show that KCPSO is an effective algorithm for providing the optimal number of clusters.
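The core idea of choosing the number of clusters under a maximum bound by scoring K-means results with a validity index can be illustrated with a minimal sketch. This is not the KCPSO algorithm itself: a simple scan over every candidate cluster count and a deterministic farthest-first seeding stand in for the combinatorial particle swarm search, and the Davies-Bouldin index is assumed as the validity index because the abstract does not name one. All function names (`farthest_first`, `kmeans`, `davies_bouldin`, `choose_clustering`) are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def assign(data, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in data]


def farthest_first(data, k):
    """Deterministic seeding (assumption: stand-in for swarm-based selection)."""
    seeds = [data[0]]
    while len(seeds) < k:
        seeds.append(max(data, key=lambda p: min(dist(p, s) for s in seeds)))
    return seeds


def kmeans(data, centers, max_iter=100):
    """Plain Lloyd's K-means refinement starting from the given centers."""
    centers = list(centers)
    labels = assign(data, centers)
    for _ in range(max_iter):
        for j in range(len(centers)):
            members = [p for p, l in zip(data, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean(members)
        new_labels = assign(data, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels


def davies_bouldin(data, centers, labels):
    """Davies-Bouldin index (assumed validity index); lower is better."""
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(data, labels) if l == j]
        if not members:
            return math.inf  # treat a result with an empty cluster as invalid
        scatter.append(sum(dist(p, centers[j]) for p in members) / len(members))
    total = 0.0
    for i in range(k):
        ratios = []
        for j in range(k):
            if j != i:
                d = dist(centers[i], centers[j])
                if d == 0.0:
                    return math.inf  # coincident centers: invalid result
                ratios.append((scatter[i] + scatter[j]) / d)
        total += max(ratios)
    return total / k


def choose_clustering(data, k_max):
    """Try every k in 2..k_max and keep the result with the best index."""
    best = None
    for k in range(2, k_max + 1):
        centers, labels = kmeans(data, farthest_first(data, k))
        score = davies_bouldin(data, centers, labels)
        if best is None or score < best[0]:
            best = (score, k, centers, labels)
    return best


if __name__ == "__main__":
    # Three well-separated 3x3 grids of points as toy data.
    offsets = (-0.5, 0.0, 0.5)
    blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    data = [(cx + dx, cy + dy)
            for cx, cy in blobs for dx in offsets for dy in offsets]
    score, k, centers, labels = choose_clustering(data, k_max=6)
    print("chosen number of clusters:", k)
```

Only the maximum `k_max` has to be supplied; the cluster count is chosen by the index. A swarm-based search such as the one described above would explore cluster counts and center choices jointly instead of enumerating every count.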