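In recent years, as technology has advanced, the volume of high-dimensional data has grown ever larger, and feature extraction is one way to reduce data dimensionality and thereby address the computational complexity of such massive data. For hyperspectral imagery, band selection is used as a data preprocessing step to avoid the Hughes phenomenon, in which an increasing number of bands degrades classification accuracy, and to reduce computational complexity. Several researchers have proposed band selection methods based on optimization algorithms, but their band grouping criteria are rather strict, which limits how far the dimensionality can be reduced. This thesis therefore proposes a supervised band selection method with a more relaxed grouping criterion. The method first groups the highly correlated bands of each class, using particle swarm optimization as the main procedure and the correlation coefficient matrix as an aid. It then applies an impurity-based band prioritization method to tally, for the grouped highly correlated bands of each class, the relationship between single-class bands and all-class bands. Finally, representative bands are picked from the tallied results by majority vote, reducing the dimensionality of the massive data set. The experiments use a MASTER remote sensing image of the Aogu Wetland and an AVIRIS remote sensing image of Northwest Tippecanoe County. The experimental results show that the proposed method effectively selects bands that are representative of each class and effectively reduces data dimensionality, and that a classifier applied to the selected bands achieves good classification results.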
In recent years, as technology has advanced, the volume of data in high-dimensional data sets has grown, and feature extraction is one way to reduce dimensionality and thereby address the computational complexity of big data. For hyperspectral imagery, band selection is used as a data preprocessing step to avoid the Hughes phenomenon, in which an increasing number of bands lowers classification accuracy, and to reduce the computational complexity of processing hyperspectral image data sets. A number of researchers have proposed band selection methods based on optimization algorithms, but their selection criteria are too stringent, so the dimensionality cannot be reduced significantly. Therefore, this paper proposes a new supervised band selection algorithm based on particle swarm optimization (PSO). The PSO algorithm groups the highly correlated bands of hyperspectral imagery into the same modules; a band prioritization method then compiles statistics on the relationship between bands and classes; and finally bands are selected from these statistics by weighted majority voting, reducing the dimensionality of big data sets. Furthermore, information about highly correlated bands is integrated into the PSO algorithm during the update phase, which makes PSO more effective at finding large modules. The effectiveness of the proposed method is evaluated on MASTER and AVIRIS hyperspectral images. The experimental results demonstrate that the proposed method not only reduces the dimensionality of the data sets through band selection but also offers satisfactory classification performance.
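The overall flow described above (group highly correlated bands per class, then vote across classes for representative bands) can be illustrated with a small sketch. This is not the proposed algorithm: a greedy threshold on the Pearson correlation stands in for the PSO search, a plain unweighted vote stands in for the band prioritization and weighted majority voting, and the function names (`pearson`, `group_bands`, `select_bands`), the threshold of 0.9, and the toy pixel values are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant band carries no correlation information
    return cov / sqrt(vx * vy)


def group_bands(samples, threshold=0.9):
    """Group contiguous bands of ONE class (greedy stand-in for PSO).

    samples: list of pixels, each a list of band values.
    A band joins the current group while its absolute correlation with
    the group's first band is at least `threshold` (assumed value).
    """
    n_bands = len(samples[0])
    columns = [[px[b] for px in samples] for b in range(n_bands)]
    groups = [[0]]
    for b in range(1, n_bands):
        first = groups[-1][0]
        if abs(pearson(columns[first], columns[b])) >= threshold:
            groups[-1].append(b)
        else:
            groups.append([b])
    return groups


def select_bands(class_samples, threshold=0.9, n_select=2):
    """Unweighted vote: each class votes for the middle band of each group."""
    votes = Counter()
    for samples in class_samples.values():
        for group in group_bands(samples, threshold):
            votes[group[len(group) // 2]] += 1
    # Most votes first; ties broken by the lower band index.
    ranked = sorted(votes, key=lambda b: (-votes[b], b))
    return sorted(ranked[:n_select])


# Toy data (hypothetical): 4 pixels per class, 4 bands per pixel.
water = [[1, 2, 5, 10], [2, 4, 3, 6], [3, 6, 6, 12], [4, 8, 2, 4]]
forest = [[1, 3, 2, 5], [2, 5, 4, 3], [3, 7, 6, 6], [4, 9, 8, 2]]
selected = select_bands({"water": water, "forest": forest})
print(selected)
```

In this toy case the two classes group their bands differently, yet the vote still settles on a shared pair of representative bands, which is the intuition behind combining per-class groupings before selecting bands.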