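In this era of advanced technology, cluster analysis has become a common data analysis tool, and a wide variety of clustering algorithms now exist. Common algorithms such as hard partitioning and fuzzy c-means handle general data effectively, and many further clustering algorithms have been derived to cope with different data types, so choosing an algorithm suited to the data being processed is very important. This thesis introduces fuzzy c-means (FCM) and possibilistic c-means (PCM) and compares the two algorithms. Its main focus is extending possibilistic c-means to directional data, in a method called possibilistic c-means of directional data (Directional Data-PCM), which retains the advantages of possibilistic c-means while handling directional data effectively. At the end of the thesis, examples with various simulated data sets first demonstrate the advantages of the Directional Data-PCM algorithm: in addition to producing good clustering results, it can identify the outliers in each data set. The algorithm is then applied to real examples, and comparing its clustering results with the known groupings shows that it clusters the data effectively.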
Cluster analysis is a widely used tool for data analysis. Many clustering algorithms exist, such as K-means and fuzzy c-means (FCM), which handle general data effectively, and many others have been designed for specific types of data. Choosing an algorithm suited to the data at hand is therefore important. In this thesis, we propose a new algorithm for clustering directional data, the possibilistic c-means of directional data (DD-PCM), based on the possibilistic c-means (PCM) algorithm. We first review and compare two well-known clustering algorithms, FCM and PCM. We then extend PCM to directional data: the data are represented in polar coordinates, and PCM is applied to the angles to assign the data to appropriate clusters. The proposed algorithm not only inherits the advantages of PCM but also handles directional data effectively. We use several simulated data sets to demonstrate the advantages of DD-PCM, and then apply it to real data to show its practicality. The experimental results show that the proposed algorithm is effective in dealing with directional data.
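The abstract does not state the update equations of DD-PCM, so the following is only a minimal sketch of what a PCM-style iteration on angles might look like. It makes several assumptions that are not taken from the thesis: the dissimilarity between an angle and a cluster center is taken to be 1 - cos(θ - v), the typicality update uses the standard PCM form with that dissimilarity substituted in, the scale parameters `eta` are fixed by hand, and each center is updated as a weighted circular mean. The names `angular_sq_dist`, `typicalities`, and `pcm_angles` are hypothetical.

```python
import math


def angular_sq_dist(theta, center):
    """Assumed dissimilarity on the circle: 1 - cos(theta - center), in [0, 2]."""
    return 1.0 - math.cos(theta - center)


def typicalities(angles, centers, eta, m):
    """Standard PCM-form typicality u[i][j] of angle j in cluster i."""
    return [
        [1.0 / (1.0 + (angular_sq_dist(t, c) / e) ** (1.0 / (m - 1.0)))
         for t in angles]
        for c, e in zip(centers, eta)
    ]


def pcm_angles(angles, centers, eta, m=2.0, n_iter=50):
    """Hypothetical PCM-style clustering of angles given in radians.

    angles  : data angles
    centers : initial cluster angles
    eta     : one positive scale parameter per cluster (assumed fixed)
    m       : fuzzifier, must be greater than 1
    Returns the final centers and the typicality matrix.
    """
    centers = list(centers)
    for _ in range(n_iter):
        u = typicalities(angles, centers, eta, m)
        # Weighted circular mean: minimizes sum_j u_ij^m * (1 - cos(theta_j - v_i)).
        centers = [
            math.atan2(
                sum(w ** m * math.sin(t) for w, t in zip(row, angles)),
                sum(w ** m * math.cos(t) for w, t in zip(row, angles)),
            )
            for row in u
        ]
    return centers, typicalities(angles, centers, eta, m)


# Two groups of angles (near 0 and near pi) plus one isolated point at 1.5 rad.
angles = [0.0, 0.1, 0.2, -0.1, 2.9, 3.0, 3.1, -3.1, 1.5]
centers, u = pcm_angles(angles, centers=[0.3, 2.7], eta=[0.1, 0.1])
print(centers)
print([round(row[-1], 3) for row in u])  # typicality of the isolated point per cluster
```

Unlike FCM memberships, PCM typicalities are not constrained to sum to one across clusters, so a point far from every center receives a low typicality in all of them. That property is what allows a PCM-type method to flag outliers, as the abstract describes for DD-PCM.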