
An Automatic Clustering System with Dynamic Clustering Artificial Bee Colony Algorithm

Abstract


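Clustering is a data mining technique and an unsupervised learning method that partitions data into groups according to a similarity measure. Among clustering algorithms, heuristic clustering, which applies heuristic algorithms or heuristic concepts to the clustering problem, has attracted growing attention in recent years. Compared with some other current clustering methods (e.g., k-means), heuristic clustering appears to perform better. In practice, a typical clustering algorithm requires the user to supply additional information (e.g., the number of clusters), which is not always easy for the user to decide; an algorithm that determines the number of clusters automatically would therefore be more convenient. Given the success of heuristic clustering, researchers have begun to design heuristic algorithms capable of automatic clustering, such as GCUK (genetic clustering for unknown K), derived from the genetic algorithm (GA), and MEPSO (Multi-Elitist PSO), derived from particle swarm optimization (PSO). Although both methods can cluster automatically, they suffer from poor efficiency for two reasons: (1) a poorly designed encoding format makes the solution space to be searched too large, and (2) the chosen heuristic algorithm may not suit the clustering problem or may lack sufficient search ability. This study therefore proposes an automatic clustering system (ACS) that determines the number of clusters automatically. The system first uses a cluster range discovery algorithm (CRD) to narrow the range of cluster numbers to be searched, and then uses a dynamic clustering artificial bee colony algorithm (DCABC) to perform automatic clustering. DCABC adds a model strategy to overcome the shortcomings of the existing artificial bee colony algorithm (ABC) and, through a specially designed encoding format, determines the number of clusters and optimizes clustering quality at the same time.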
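The abstract does not describe DCABC's encoding format. As a rough illustration of how a single solution vector can carry both the number of clusters and the centroids, the sketch below uses an activation-flag encoding of the kind found in the automatic-clustering literature. This is an assumption for illustration, not the authors' format; the names `decode_solution`, `assign_points`, `within_cluster_sse`, and the `threshold` parameter are hypothetical.

```python
import math

def decode_solution(solution, k_max, dim, threshold=0.5):
    """Decode a flat vector laid out as [k_max flags | k_max * dim centroid coordinates].

    A centroid is treated as active when its flag exceeds `threshold`
    (assumed layout, not taken from the paper).
    """
    flags = solution[:k_max]
    centroids = []
    for i in range(k_max):
        start = k_max + i * dim
        if flags[i] > threshold:
            centroids.append(tuple(solution[start:start + dim]))
    return centroids

def assign_points(points, centroids):
    """Return, for each point, the index of its nearest active centroid."""
    if not centroids:
        raise ValueError("no active centroids in this solution")
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]

def within_cluster_sse(points, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    labels = assign_points(points, centroids)
    return sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))

# Usage: k_max = 3 candidate centroids in 2-D; the second flag (0.2) is inactive.
solution = [0.9, 0.2, 0.7, 0.0, 0.0, 5.0, 5.0, 10.0, 10.0]
points = [(0, 1), (1, 0), (9, 10), (10, 9)]
active = decode_solution(solution, k_max=3, dim=2)
labels = assign_points(points, active)
sse = within_cluster_sse(points, active)
```

Here the number of clusters is simply `len(active)`, so a search algorithm that perturbs the flags changes the cluster count, while perturbing the coordinates refines cluster quality. A real system would also need a rule for solutions with too few active centroids and a validity index that does not trivially favor more clusters; both are omitted here.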

Parallel Abstract


Purpose: This study designs a system that automatically determines the number of clusters. Drawing on past research on heuristic algorithms and heuristic clustering, it addresses a weakness of the ABC algorithm and proposes a Model Strategy to improve the algorithm's search performance.

Design/methodology/approach: This study designs an Automatic Clustering System (ACS). ACS first uses a Cluster Range Discovery (CRD) algorithm to narrow the search range for the number of clusters, then uses the Dynamic Clustering Artificial Bee Colony algorithm (DCABC) to complete the automatic clustering. DCABC adopts the Model Strategy to overcome a drawback of the original artificial bee colony algorithm (ABC) and introduces a new encoding format that lets it cluster the data and find the number of clusters simultaneously.

Findings: Following the success of metaheuristic clustering, researchers have begun to design automatic clustering algorithms based on metaheuristic methods. The experimental results show that the proposed DCABC finds a suitable number of clusters and outperforms ABC.

Research limitations/implications: Although the proposed algorithm automatically determines an appropriate number of clusters, the user must still specify the range of cluster numbers at initialization.

Practical implications: The proposed ACS uses the CRD algorithm to narrow the search range for the number of clusters and DCABC to complete the automatic clustering, with the Model Strategy overcoming a drawback of the original ABC.

Originality/value: Unlike similar algorithms, which require the number of clusters to be assigned manually, the proposed ACS determines the number of clusters automatically. With its purpose-designed encoding format, DCABC clusters the data and finds the number of clusters simultaneously.
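For readers unfamiliar with the baseline that DCABC modifies, the following is a minimal sketch of the candidate-generation step in the standard ABC algorithm: one randomly chosen dimension of a food source is perturbed relative to a randomly chosen partner source, and the candidate replaces the original only if it is better. It does not include the Model Strategy, whose details the abstract does not give; the function names and the `cost` callback are illustrative assumptions.

```python
import random

def abc_neighbor(foods, i, rng=random):
    """Standard ABC candidate: v_ij = x_ij + phi * (x_ij - x_kj), with phi in [-1, 1].

    `k` is a random partner different from `i`; `j` is a random dimension.
    """
    k = rng.choice([idx for idx in range(len(foods)) if idx != i])
    j = rng.randrange(len(foods[i]))
    phi = rng.uniform(-1.0, 1.0)
    candidate = list(foods[i])
    candidate[j] = foods[i][j] + phi * (foods[i][j] - foods[k][j])
    return candidate

def greedy_select(current, candidate, cost):
    """Keep the candidate only if it has strictly lower cost (minimization)."""
    return candidate if cost(candidate) < cost(current) else current

# Usage: three 2-D food sources; generate one candidate for source 0.
foods = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
candidate = abc_neighbor(foods, 0, random.Random(0))
```

Because only one dimension changes per step and the step size depends on the distance to a random partner, the basic search can be slow to exploit good solutions, which is the kind of limitation that modified ABC variants typically target.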

