資料分群(Data Clustering)是資料採礦領域重要的研究課題,也是一種相當重要的資料分析技術。由於資料分群問題屬於NP-Hard的問題,所以有許多學者應用巨集啟發法則求解該種問題。Shelokar首先應用螞蟻最佳化演算法(Ant Colony Optimization, ACO)來發展資料分群方法,但是該方法並未使用啟發資訊,所以須要較多的螞蟻數目及較長的時間才能得到較佳的解。本研究改良其方法,將啟發資訊加入於狀態移轉機率模式,期望能提升資料分群之能力。 本研究所提出的演算法,主要是利用資料點與各個群中心的距離作為啟發資訊,來加快個別螞蟻的分群速度,同時以費洛蒙的長期學習機制來避免落入區域最佳解,因此可以有效解決分群問題。其分群過程是讓螞蟻先隨機挑選資料點,再依機率模式選擇群組,而個別螞蟻所形成的群中心會隨著分群過程逐漸調整;當分群過程結束時,最終之群中心也隨之形成。本研究利用數個典型資料分群題目進行驗證,結果顯示本研究所提出的演算法確實優於Shelokar的演算法。最後本研究將演算法應用於求解單元形成問題,以證實其在實務應用面之能力。
Data Clustering is one of the important research topics of data mining. Many people used meta-heuristic method to solve this NP-Hard problem. Previously, Shelokar is the first one to propose an ant colony approach for clustering. Because the heuristic information does not be used in this approach, more ants and more CPU time are required to attain the best solution. In this paper we improve this approach by adding heuristic information to the state transition rule. At the core of the algorithm we use both the accumulated pheromone and the heuristic information, the distances between data objects and cluster centers of ants, to guide artificial ants to group data objects into proper clusters. This allows the algorithm to perform the clustering process more effectively and efficiently. Due to the nature of stochastic and population-based search, our algorithm can overcome the drawbacks of traditional clustering methods that easily converge to local optima. Experimental results show that the algorithm is batter than Shelokar’s approach. Finally, we apply our algorithm to solve Cell Formation Problem and have a good result.