一種改良式特徵挑選方法及其於分類分析之應用

An Improved Feature Selection Approach and Its Application on Classification Analysis

Advisor: 歐陽振森

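Abstract

This study proposes an improved feature selection method that combines the maximal information coefficient (MIC) with k-medoids clustering, so that discriminative features can be selected from a large number of features and an accurate classification model can then be built. First, the MIC is used to compute the similarity between each pair of features and between each feature and the class. Next, the k-medoids clustering algorithm partitions all features into K clusters, and from each cluster the feature with the largest MIC with respect to the class is chosen as its representative. This yields K representative features. To verify the superiority of our method, the experiments consider four classification datasets with different characteristics and four classifiers, and compare against three other feature selection methods. The experimental results show that our method achieves better classification performance than the other feature selection methods for most combinations of dataset and classifier.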

Parallel Abstract

In this study, an improved feature selection approach is proposed that combines the maximal information coefficient (MIC) with k-medoids clustering in order to select discriminative features from a large feature set and build accurate classification models. First, the MIC is used to compute the similarity between each pair of features and between each feature and the class variable. Then, all features are grouped into K clusters by k-medoids clustering, and from each cluster the feature with the maximum MIC value with respect to the class variable is selected as its representative, yielding K representative features. To verify the superiority of our approach, four datasets with different properties and four classifiers are considered, and our approach is compared with three other feature selection approaches. The experimental results show that our approach achieves better classification performance than the other approaches for most combinations of dataset and classifier.
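
The selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and it rests on several assumptions the abstract does not state: the MIC values are taken as precomputed inputs (a feature-to-feature matrix `sim` and a feature-to-class vector `relevance`), the clustering dissimilarity is taken to be 1 − MIC, and the k-medoids variant is a simple alternating scheme with deterministic farthest-point seeding. The function names `assign_clusters`, `k_medoids`, and `select_features` are hypothetical.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def assign_clusters(dist: Matrix, medoids: List[int]) -> List[List[int]]:
    """Assign every feature to its nearest medoid; a medoid stays in its own cluster."""
    clusters: List[List[int]] = [[] for _ in medoids]
    for i in range(len(dist)):
        if i in medoids:
            clusters[medoids.index(i)].append(i)
        else:
            nearest = min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            clusters[nearest].append(i)
    return clusters


def k_medoids(dist: Matrix, k: int, max_iter: int = 100) -> List[List[int]]:
    """Alternating k-medoids on a precomputed distance matrix; returns the clusters."""
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of features")
    # Assumed initialisation: start from the most central feature, then
    # repeatedly add the feature farthest from the medoids chosen so far.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(max(candidates, key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(max_iter):
        clusters = assign_clusters(dist, medoids)
        # The new medoid of a cluster minimises the total distance to its members.
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign_clusters(dist, medoids)


def select_features(sim: Matrix, relevance: Sequence[float], k: int) -> List[int]:
    """Cluster features by similarity, then keep the most class-relevant one per cluster."""
    # Assumption: dissimilarity = 1 - similarity, with similarity in [0, 1].
    dist = [[1.0 - s for s in row] for row in sim]
    clusters = k_medoids(dist, k)
    return sorted(max(members, key=lambda i: relevance[i]) for members in clusters)


if __name__ == "__main__":
    # Toy data (made up for illustration): three pairs of mutually similar features.
    groups = [0, 0, 1, 1, 2, 2]
    sim = [[1.0 if i == j else (0.9 if groups[i] == groups[j] else 0.1)
            for j in range(6)] for i in range(6)]
    relevance = [0.2, 0.8, 0.5, 0.4, 0.1, 0.3]
    print(select_features(sim, relevance, k=3))  # prints [1, 2, 5]
```

In the toy example each pair of similar features forms one cluster, and the feature with the higher class relevance in each pair is kept. In practice the entries of `sim` and `relevance` would be MIC values computed from the data.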
