

Using virtual sample and fuzzy clustering method for improving semi-supervised support vector machines

Advisor: 姚志佳


Support vector machines (SVMs) provide excellent classification performance and have been widely applied to real-world classification problems. Recent studies have demonstrated the effectiveness of semi-supervised support vector machines, which train on both labeled and unlabeled samples and achieve better results than methods that use only labeled data. This thesis uses the PIM Fuzzy C-means method to label the unlabeled samples and proposes a novel virtual sample method to further improve the labeling quality of PIM Fuzzy C-means. Experimental results show that the proposed method improves the labeling produced by PIM Fuzzy C-means, and that a semi-supervised support vector machine trained on the resulting labeled data achieves higher accuracy. In future work, we hope to integrate the virtual sample method into the fuzzy clustering method itself, so that the data distribution is represented more accurately and PIM Fuzzy C-means yields better labels.
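
The labeling step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses the standard fuzzy c-means update of Bezdek et al. rather than the PIM variant, it omits the virtual sample method entirely, and the function names `fuzzy_c_means` and `pseudo_label`, the majority-vote mapping from clusters to classes, and the `-1` marker for unresolved samples are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (not the PIM variant); returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalized to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance from every sample to every center, floored to avoid division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def pseudo_label(X_lab, y_lab, X_unl, n_clusters=2):
    """Assign labels to X_unl by clustering labeled and unlabeled samples together.

    Assumption for illustration: each cluster takes the majority class of the
    labeled samples it contains; samples in a cluster with no labeled member get -1.
    """
    X_all = np.vstack([X_lab, X_unl])
    _, U = fuzzy_c_means(X_all, n_clusters)
    hard = U.argmax(axis=1)  # highest-membership cluster per sample
    lab_clusters = hard[: len(X_lab)]
    cluster_to_class = {}
    for k in range(n_clusters):
        members = y_lab[lab_clusters == k]
        if members.size:
            vals, counts = np.unique(members, return_counts=True)
            cluster_to_class[k] = vals[counts.argmax()]
    return np.array([cluster_to_class.get(k, -1) for k in hard[len(X_lab):]])
```

The labels returned by `pseudo_label` could then be combined with the original labeled samples to train a classifier such as scikit-learn's `SVC`, which is one plausible way to realize the semi-supervised training the abstract describes.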


[1] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, 2002.
[2] V. N. Vapnik, Statistical Learning Theory. New York: Wiley, 1998.
[3] L. E. Peterson, "K-nearest neighbor," Scholarpedia, vol. 4, p. 1883, 2009.
[4] J. C. Bezdek, R. Ehrlich, and W. Full, "FCM: The fuzzy c-means clustering algorithm," Computers & Geosciences, vol. 10, pp. 191-203, 1984.
[5] Z. Wu, H. Zhang, and J. Liu, "A fuzzy support vector machine algorithm for classification based on a novel PIM fuzzy clustering method," Neurocomputing, vol. 125, pp. 119-124, 2014.
