
一種應用鄰近關係的特徵擷取演算法

A Novel Feature Selection Method based on the Neighborhood Relation

Advisor: 周建興

Abstract


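In recent years, feature selection has increasingly been applied in fields that must extract important features from data on the scale of thousands to tens of thousands, including character recognition, gene microarrays, and biomedical signal analysis.

Feature selection plays a crucial role in pattern recognition and machine learning. Among the many feature selection methods, sequential forward search (SFS) and sequential backward search (SBS) are the most widely adopted. The method proposed in this thesis combines a neighborhood relation with SFS and SBS. We develop a feature selection approach based on the concept of neighborhood: each feature is given a distance weight according to its relation to the preceding and following features (for one-dimensional data) or to the features above, below, left, and right of it (for two-dimensional data). Features are then ranked by these distance weights and, in that order, those that raise the recognition rate are selected, after which the remaining unimportant features are removed. This yields an important feature subset that improves the recognition rate, with the selected features clustered in key regions.

In the experiments, we test the method on character recognition and on rat brain waves. For the rat brain waves, we perform feature selection over three behavioral states: awake (AW), slow-wave sleep (SWS), and rapid eye movement sleep (REM). For character recognition, we analyze the Chinese characters 「太」, 「大」, and 「犬」 in three groupings: 「太」 and 「大」; 「大」 and 「犬」; and 「太」, 「大」, and 「犬」 together. The proposed method improves the recognition rate in these experiments, and the feature subsets it finds fall within the key regions.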

Parallel Abstract


Recently, feature selection has been applied in many areas that handle data on the scale of thousands, such as text categorization, microarrays, and biomedical signal analysis. Feature selection plays a crucial role in pattern recognition and machine learning. Among the many feature selection methods, sequential forward search (SFS) and sequential backward search (SBS) are the most widely used. The proposed method combines proximity with SFS and SBS. We have developed a feature selection method based on the concept of proximity: each feature receives a distance weight according to its relationship with the features before and after it (one-dimensional data) or above, below, left, and right of it (two-dimensional data). The features are ranked by distance weight, and those that improve the recognition rate are selected in that order; the remaining unimportant features are then removed. This filters out an important feature subset that enhances the recognition rate, and the selected features gather in critical areas. In the experimental part of this thesis, we run simulation tests on character recognition and on rat brain waves. For the rat brain waves, we perform feature selection over three behavioral states: awake (AW), slow wave sleep (SWS), and rapid eye movement sleep (REM). For character recognition, we experiment with the characters "太", "大", and "犬" in three groups: "太" and "大"; "大" and "犬"; and "太", "大", and "犬". The proposed method enhances the recognition rate in these experiments, and the feature subsets it finds fall in the critical regions.
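The abstract does not specify the weighting formula or the classifier, so the following is only a minimal sketch of the forward (SFS-like) half of the idea for one-dimensional data, under stated assumptions. The names `neighbor_weights`, `nearest_centroid_accuracy`, and `neighborhood_sfs`, the exponential `decay` weighting, and the nearest-centroid evaluator are all hypothetical stand-ins, not the thesis's actual definitions. The backward (SBS-like) removal step is omitted.

```python
import numpy as np

def neighbor_weights(n_features, selected, decay=0.5):
    """Assumed weighting: decay ** (distance to the nearest selected index)."""
    if not selected:
        return np.ones(n_features)
    idx = np.arange(n_features)
    dist = np.min(np.abs(idx[:, None] - np.asarray(selected)[None, :]), axis=1)
    return decay ** dist

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, features):
    """Stand-in evaluator: nearest-centroid accuracy on the given feature subset."""
    classes = np.unique(y_train)
    Xtr, Xte = X_train[:, features], X_test[:, features]
    centroids = np.stack([Xtr[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test sample to every class centroid
    d = ((Xte[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(classes[np.argmin(d, axis=1)] == y_test))

def neighborhood_sfs(X_train, y_train, X_test, y_test, decay=0.5):
    """Forward selection that tries candidates nearest the selected features first."""
    n = X_train.shape[1]

    def score(feats):
        return nearest_centroid_accuracy(X_train, y_train, X_test, y_test, feats)

    # Seed with the single feature that scores best on its own
    single = [score([j]) for j in range(n)]
    selected = [int(np.argmax(single))]
    best = single[selected[0]]

    improved = True
    while improved:
        improved = False
        weights = neighbor_weights(n, selected, decay)
        # Rank unselected features by proximity weight, closest first
        order = np.argsort(-weights, kind="stable")
        for j in (int(k) for k in order if int(k) not in selected):
            acc = score(selected + [j])
            if acc > best:  # keep a feature only if the recognition rate rises
                selected.append(j)
                best = acc
                improved = True
                break
    return sorted(selected), best
```

Calling `neighborhood_sfs(X_train, y_train, X_test, y_test)` on arrays of shape `(samples, features)` returns the chosen feature indices and the accuracy reached. Because candidates adjacent to already-selected features are tried first, the subset tends to grow as a contiguous cluster, which mirrors the clustering behavior the abstract describes. A two-dimensional variant would replace the index distance with a grid distance over the up, down, left, and right neighbors.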

Parallel Keywords

weight value, SFS, SBS, text categorization

