Possibilistic C-Means Clustering for Directional Data

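In this era of advanced technology, cluster analysis has become a common data analysis tool, and existing clustering algorithms are many and varied. Common algorithms such as hard partitioning and fuzzy c-means handle general data effectively, and many further clustering algorithms have been derived to suit particular data types, so choosing an algorithm suited to the data at hand is very important. This thesis introduces and compares fuzzy c-means (FCM) and possibilistic c-means (PCM), but its focus is on extending possibilistic c-means to directional data. The resulting method, called possibilistic c-means for directional data (Directional Data-PCM), retains the advantages of PCM while handling directional data effectively. At the end of the thesis, examples on a variety of simulated data first demonstrate the advantages of the Directional Data-PCM algorithm: besides producing good clustering results, it can also identify the outliers in each data set. The algorithm is then applied to real examples, where comparing its clustering results with the known groupings shows that it clusters the data effectively.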

Keywords

fuzzy c-means; possibilistic c-means; directional data; possibilistic c-means for directional data

Parallel Abstract

In this era of technological advancement, cluster analysis is a commonly used tool for analyzing data. Many algorithms already exist, such as K-means and fuzzy c-means, which handle general data effectively, and many other clustering algorithms have been designed for specific types of data. It is therefore very important to choose an algorithm suited to the data at hand. In this thesis, we propose a new algorithm, the possibilistic c-means of directional data (DD-PCM), which clusters directional data based on the PCM algorithm. We first review and compare two well-known clustering algorithms, fuzzy c-means and possibilistic c-means. We then focus on extending possibilistic c-means to directional data. Directional data are represented in polar coordinates, and possibilistic c-means is then applied to the angles to cluster the data into proper classes. The proposed algorithm not only inherits the advantages of the possibilistic c-means algorithm but also handles directional data effectively. We use examples on various simulated data sets to demonstrate the advantages of DD-PCM, and then use real-data examples to show its practicability. The experimental results show the effectiveness of the proposed algorithm in dealing with directional data.
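The idea of applying possibilistic c-means to angles can be illustrated with a minimal sketch. This is not the thesis's DD-PCM formulation, which the abstract does not spell out. It assumes the dissimilarity 1 - cos(θ - v) between an angle θ and a cluster centre v, the standard PCM typicality update with a fixed scale `eta`, and a typicality-weighted circular mean as the centre update. The function names, the value of `eta`, and the toy data are all hypothetical.

```python
import math


def angular_dist2(theta, center):
    """Assumed dissimilarity between two angles: 1 - cos(difference), in [0, 2]."""
    return 1.0 - math.cos(theta - center)


def typicalities(thetas, centers, eta, m):
    """Standard PCM typicality of every angle to every centre (rows = centres)."""
    expo = 1.0 / (m - 1.0)
    return [[1.0 / (1.0 + (angular_dist2(t, v) / eta) ** expo) for t in thetas]
            for v in centers]


def pcm_angles(thetas, init_centers, eta=0.1, m=2.0, n_iter=50):
    """Alternate typicality and centre updates for a fixed number of iterations."""
    centers = list(init_centers)
    for _ in range(n_iter):
        u = typicalities(thetas, centers, eta, m)
        # Weighted circular mean: minimises sum_j u_ij^m * (1 - cos(theta_j - v_i)).
        centers = [
            math.atan2(sum(w ** m * math.sin(t) for w, t in zip(row, thetas)),
                       sum(w ** m * math.cos(t) for w, t in zip(row, thetas)))
            for row in u
        ]
    return centers, typicalities(thetas, centers, eta, m)


# Toy data (hypothetical): two groups of angles in radians plus one stray angle.
angles = [0.4, 0.5, 0.6, 2.9, 3.0, 3.1, -1.5]
centers, u = pcm_angles(angles, init_centers=[0.3, 3.2])
print([round(c, 2) for c in centers])
print([round(max(u[0][j], u[1][j]), 2) for j in range(len(angles))])
```

Because typicalities in PCM are not forced to sum to one across clusters, the stray angle can receive a low typicality for every centre, which is one way such a method could flag outliers. Like ordinary PCM, this sketch is sensitive to the initial centres and to the choice of `eta`.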

Parallel Keywords

fuzzy c-means; possibilistic c-means; directional data; DD-PCM

