There is a considerable literature on clustering, and methods such as partitioning, hierarchical, k-means, and fuzzy c-means (FCM) clustering can all be used to group data. Data, however, come in many forms: they may be numerical, symbolic, or fuzzy. Many studies apply various clustering methods to each of these types separately, and most existing methods are designed for numerical data, but few studies deal with data that mix numerical, symbolic, and fuzzy components. This thesis therefore presents fuzzy clustering algorithms for mixed-type data (i.e., data composed of numerical, symbolic, and fuzzy components) based on the fuzzy c-means (FCM) algorithm [4]. The work starts from the definition of symbolic distance proposed by Diday [3] and Gowda & Diday [5,6] and from the definition of fuzzy distance proposed by Hathaway & Bezdek [8]. In the course of the study, both distance definitions were found to conflict with intuition, so each was modified appropriately, and the modified definitions gave good results. Finally, the proposed method is applied to a real data set on automobiles that contains mixed-type data, and it again yields good clustering results. The proposed method can therefore be used to cluster both mixed-type data and data of a single type.
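As a rough illustration of the standard FCM iteration that the abstract builds on, the following is a minimal sketch for purely numerical data with squared Euclidean distance. The function name `fcm`, its parameters, and the toy data are assumptions made for illustration only; the modified symbolic and fuzzy distances proposed in the thesis are not reproduced here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on numeric vectors (illustrative sketch).

    points: list of equal-length numeric tuples
    c:      number of clusters
    m:      fuzzifier, must be > 1
    Returns (centers, u) where u[k][i] is the membership of point k in cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each point's memberships sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([r / s for r in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u^m.
        centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[k] * points[k][d] for k in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: proportional to (squared distance)^(-1/(m-1)).
        new_u = []
        for k in range(n):
            d2 = [sq_dist(points[k], centers[i]) for i in range(c)]
            zeros = [i for i, v in enumerate(d2) if v == 0.0]
            if zeros:
                # Point coincides with one or more centers: split membership.
                row = [1.0 / len(zeros) if i in zeros else 0.0
                       for i in range(c)]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                s = sum(inv)
                row = [x / s for x in inv]
            new_u.append(row)

        change = max(abs(new_u[k][i] - u[k][i])
                     for k in range(n) for i in range(c))
        u = new_u
        if change < tol:
            break

    return centers, u


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of three points.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, u = fcm(data, c=2)
    for point, row in zip(data, u):
        print(point, [round(v, 3) for v in row])
```

Extending this sketch to mixed-type data would require replacing `sq_dist` and the center update with counterparts suited to symbolic and fuzzy components, which is the part the thesis addresses.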