Adaptive K-Nearest Neighbor Algorithm

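The traditional K-Nearest Neighbors (KNN) classification algorithm uses a fixed K value: the K nearest neighbors vote to decide which class a test record belongs to. However, related studies show that varying the K value can improve KNN's classification performance. This study therefore incorporates the concepts of Local KNN and Fuzzy C-means membership degrees into the KNN classification algorithm, so that each test record uses a K value better suited to itself, thereby improving overall classification performance.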
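The voting step described above can be illustrated with a minimal sketch. It is not the study's implementation: the function name `knn_predict`, the Euclidean distance, and the toy data are assumptions made only to show how the prediction for one record can change with the K used for that record.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query`.

    Hypothetical helper for illustration; Euclidean distance is assumed.
    """
    # Order training indices by distance to the query point.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    # Count the class labels of the k nearest neighbors.
    votes = Counter(train_y[i] for i in order[:k])
    # Return the label with the most votes.
    return votes.most_common(1)[0][0]

# Toy data (assumed): two points of class "a", three of class "b".
train_X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.9, 1.2)]
train_y = ["a", "a", "b", "b", "b"]

query = (0.2, 0.1)
small_k = knn_predict(train_X, train_y, query, k=3)  # vote among 3 neighbors
large_k = knn_predict(train_X, train_y, query, k=5)  # vote among all 5 points
```

Here the same query is assigned to class "a" with k=3 but to class "b" with k=5, which is the kind of sensitivity that motivates choosing K per record rather than globally.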

Keywords

K-nearest neighbor algorithm; local K-nearest neighbor algorithm; fuzzy C-means clustering algorithm; grid; density

Parallel Abstract

The K-nearest-neighbor (KNN) algorithm traditionally predicts the class of a record by a vote among its K nearest neighbors, using a fixed value of K. However, recent studies have shown that using different K values for different records can improve prediction accuracy. This study integrates the Fuzzy C-means algorithm to help determine a suitable K value for each record in a local KNN algorithm. Performance results show that this method outperforms the traditional KNN in terms of prediction accuracy.
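For readers unfamiliar with Fuzzy C-means, the membership degree of a point in each cluster can be computed from its distances to the cluster centers with the standard formula u_i = 1 / Σ_k (d_i / d_k)^(2/(m-1)). The sketch below shows only that formula; the function name `fcm_memberships`, the fuzzifier default m=2, and the fixed centers are assumptions, and the abstract does not state how the study maps these degrees to a per-record K, so that step is deliberately left out.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard Fuzzy C-means membership degrees of `point` for given centers.

    Hypothetical helper for illustration; `m` is the fuzzifier (m > 1).
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers gets full membership there.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)); the degrees sum to 1.
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists]

# Assumed toy centers; in Fuzzy C-means they would be learned iteratively.
centers = [(0.0, 0.0), (2.0, 0.0)]
midpoint = fcm_memberships((1.0, 0.0), centers)   # equidistant from both centers
near_first = fcm_memberships((0.5, 0.0), centers) # closer to the first center
```

A record with one dominant membership sits well inside a cluster, while near-equal memberships indicate a boundary region; a distinction of this kind is what a per-record choice of K could plausibly draw on.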

Parallel Keywords

KNN; Local KNN; Fuzzy C-means; Grid; Density

