Fuzzy Clustering of Complete and Incomplete Data Based on Nearest Clusters and Linear Regression

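In fuzzy clustering, the Gustafson-Kessel (G-K) algorithm assigns every cluster an ellipsoidal decision region of the same size. Because training data are not all distributed in the same way, however, the regions occupied by different clusters often differ in size, so the size of each decision region should be adjusted appropriately to improve clustering accuracy. To address this shortcoming of the G-K algorithm, this thesis proposes a fuzzy clustering algorithm that adaptively learns the volume of the decision regions. Particle Swarm Optimization (PSO) is used to optimize the cluster volumes obtained by the G-K algorithm, and the volumes are updated iteratively so that the decision regions are learned more precisely. This adaptive learning reduces the error that differing data distributions may cause. When data are collected in practice, weak measurement signals, operator error, or instrument failure may leave some dimensions of the measured data without values, producing incomplete data. Many prediction and handling strategies have already been proposed for this problem. By analyzing the complete feature data of the prototype data, this thesis uses nearest-cluster data and a multiple linear regression model, combined with the adaptive-volume clustering algorithm, to propose two prediction strategies for handling incomplete data. The proposed methods are verified on several data sets by comparison with other strategies for handling incomplete data.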
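For orientation, the role of the cluster volume can be seen in the standard G-K distance. The block below is a sketch in the usual textbook notation; the symbols are assumptions introduced here and do not appear in the abstract. Here u_ik is the membership of sample x_k in cluster i, v_i is the cluster center, m is the fuzzifier, n is the data dimension, and rho_i is the volume parameter of cluster i. The classical algorithm fixes rho_i in advance (commonly rho_i = 1 for every cluster), and this is presumably the quantity that the proposed method tunes with PSO.

```latex
% Fuzzy covariance matrix of cluster i
F_i = \frac{\sum_{k=1}^{N} u_{ik}^{m}\,(x_k - v_i)(x_k - v_i)^{\top}}
           {\sum_{k=1}^{N} u_{ik}^{m}}

% Norm-inducing matrix with cluster volume \rho_i
A_i = \bigl[\rho_i \det(F_i)\bigr]^{1/n} F_i^{-1}

% Squared distance from sample x_k to cluster i
D_{ik}^{2} = (x_k - v_i)^{\top} A_i \,(x_k - v_i)
```

Since det(A_i) = rho_i, a single shared value of rho_i gives every cluster the same ellipsoid volume, which is the restriction the abstract describes.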

Keywords

G-K Algorithm; Adaptive Learning; Fuzzy Clustering; Particle Swarm Optimization; Incomplete Data

Parallel Abstract

In the field of fuzzy clustering, the Gustafson-Kessel (G-K) algorithm assumes a fixed volume for each ellipsoidal decision region. However, because data distributions differ, the volume of the decision regions should be fine-tuned for different kinds of clusters to improve clustering accuracy. To address this drawback of the G-K algorithm, this thesis proposes a fuzzy clustering algorithm with adaptive learning of the ellipsoidal decision regions. Particle Swarm Optimization is used to find the optimal volume of each cluster, so that the decision region of each cluster is adapted iteratively to the data distribution. Through this adaptive learning the decision regions are adjusted more accurately, and the error caused by differing data distributions is reduced. In real-world applications, weak measurement signals, improper operation, or equipment malfunction may leave data features with missing components, which leads to the incomplete data problem. Many strategies have been proposed to solve this problem. In this thesis, two improved strategies for estimating missing values are given, based on an analysis of the complete part of the prototype data, the nearest cluster, and multiple linear regression models. Numerical simulations on artificial and real data sets were used to verify the effectiveness of the proposed strategies.
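The regression-based estimation of missing values can be illustrated with a minimal sketch. It is not the thesis's method: it assumes missing entries are marked as NaN, fits an ordinary least-squares model on the fully observed rows only, and leaves out the nearest-cluster step and the adaptive-volume clustering entirely. The function name `impute_by_regression` is hypothetical.

```python
import numpy as np

def impute_by_regression(X):
    """Fill NaN entries using linear regression fitted on complete rows.

    Illustrative assumption: each incomplete row gets its own model that
    predicts the missing features from the features observed in that row.
    """
    X = np.asarray(X, dtype=float)
    complete = X[~np.isnan(X).any(axis=1)]   # rows with no missing value
    filled = X.copy()
    for i, row in enumerate(X):
        miss = np.isnan(row)
        if not miss.any():
            continue
        obs = ~miss
        # Design matrix: intercept column plus the observed features.
        A = np.column_stack([np.ones(len(complete)), complete[:, obs]])
        coef, *_ = np.linalg.lstsq(A, complete[:, miss], rcond=None)
        # Predict the missing features of this row from its observed ones.
        filled[i, miss] = np.concatenate(([1.0], row[obs])) @ coef
    return filled

# Toy data in which the second feature equals 2 * first + 1.
data = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, np.nan]]
result = impute_by_regression(data)
```

In this toy case the missing entry is recovered from the linear relation among the complete rows. A cluster-aware variant would restrict the fit to rows from the nearest cluster, which is closer in spirit to what the abstract describes.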

Parallel Keywords

G-K Algorithm; Fuzzy Clustering; Adaptive Learning; Particle Swarm Optimization; Incomplete Data
