  • Thesis

運用多元卷積類神經網路架構於多光譜遙測影像之物質分類

Material Classification in Multispectral Remote Sensing Image Using Multiple Convolutional Neural Network Architectures

Advisor: 林春宏

Abstract


To achieve realistic simulation, a terrain model must combine various material and texture information, so terrain reconstruction plays an important role in three-dimensional terrain simulation. Building such models in the traditional way, however, consumes enormous manpower and time. This study therefore adopts a convolutional neural network (CNN) architecture to classify materials in multispectral remote sensing images, so as to simplify future model construction. The multispectral remote sensing images in this study include RGB visible-light, near-infrared (NIR), normalized difference vegetation index (NDVI), and digital surface model (DSM) images.

This thesis proposes the RUNet model, which employs multiple convolutional neural network architectures, for material classification of multispectral remote sensing images. RUNet is based on an improved U-Net architecture and incorporates the shortcut connections of the ResNet model to preserve the features extracted by shallow layers. The architecture is divided into an encoder and a decoder: the encoder contains 10 convolutional layers and 4 pooling layers, while the decoder contains 4 upsampling layers, 8 convolutional layers, and 1 classification convolutional layer. The material classification workflow comprises training and testing of the RUNet model. Because remote sensing images are large, the training process randomly crops equally sized sub-images from the training set and feeds them to the RUNet model. To account for the spatial information of materials, the testing process uses mirror padding and overlap cropping to cut multiple test sub-images from the test set; RUNet classifies each sub-image, and the classification results are then merged back into the original test image.

To evaluate the performance of the proposed method, material classification experiments were conducted with the RUNet model on the Inria, Inria-2, and ISPRS remote sensing image datasets; the effects of mirror padding and overlap cropping, as well as the influence of sub-image size on classification, were also analyzed. In the Inria experiments, after morphological refinement the RUNet classification reached an overall IoU of about 70.82% and an accuracy of about 95.66%, outperforming other methods. On the Inria-2 dataset, the refined results achieved an overall IoU of about 75.5% and an accuracy of about 95.71%; although an improved FCN obtained better results, the RUNet model required less training time. In the ISPRS experiments, combining multispectral, NDVI, and DSM images yielded an overall accuracy of about 89.71%, surpassing classification with RGB images alone. NIR and DSM provide additional material feature information that effectively reduces the classification confusion caused by identical color, shape, or texture in RGB images. The experiments show that the proposed method classifies materials in remote sensing images better than other methods, and it is expected to be applicable to model construction for simulation systems, land-use monitoring, and disaster assessment.
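The test-time procedure described above (mirror padding, overlap cropping, classification, and merging) can be sketched with NumPy. The tile size, stride, and function names below are illustrative assumptions, not the thesis's actual implementation; overlapping predictions are merged here by averaging class scores:

```python
import numpy as np

def pad_mirror(img, pad):
    """Mirror-pad an (H, W, C) image so border tiles keep spatial context."""
    return np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")

def classify_with_overlap(img, predict, tile=256, stride=128):
    """Cut overlapping tiles, classify each, and merge back into one map.

    `predict` maps a (tile, tile, C) sub-image to (tile, tile, n_classes)
    class scores; overlapping predictions are averaged before the argmax.
    Assumes the padded size minus `tile` is a multiple of `stride`, so
    every pixel is covered by at least one tile.
    """
    pad = (tile - stride) // 2
    padded = pad_mirror(img, pad)
    H, W = img.shape[:2]
    n_classes = predict(padded[:tile, :tile]).shape[-1]
    scores = np.zeros((H + 2 * pad, W + 2 * pad, n_classes))
    counts = np.zeros((H + 2 * pad, W + 2 * pad, 1))
    for y in range(0, padded.shape[0] - tile + 1, stride):
        for x in range(0, padded.shape[1] - tile + 1, stride):
            scores[y:y + tile, x:x + tile] += predict(padded[y:y + tile, x:x + tile])
            counts[y:y + tile, x:x + tile] += 1
    merged = scores / counts  # average overlapping predictions
    return merged[pad:pad + H, pad:pad + W].argmax(axis=-1)
```

Mirror padding keeps border tiles from seeing artificial zero context, and overlap cropping lets each pixel be classified with its surroundings visible from several tile positions.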

Parallel Abstract


To achieve the effect of real-world simulation, terrain models must combine various material and texture information, so terrain reconstruction plays an important role in the three-dimensional numerical simulation of terrain. However, building the model in the traditional way often costs a great deal of manpower and time. Therefore, this study uses a convolutional neural network (CNN) architecture to classify materials in multispectral remote sensing images in order to simplify the construction of future models. The multispectral remote sensing images in this study include RGB visible light, near-infrared (NIR), normalized difference vegetation index (NDVI), and digital surface model (DSM) images. This paper proposes the RUNet model, which uses multiple convolutional neural network architectures, for material classification. The RUNet model is based on an improved U-Net architecture combined with the shortcut connections approach of the ResNet model to preserve the features extracted by shallow layers. The architecture is divided into an encoder and a decoder. The encoder includes 10 convolutional layers and 4 pooling layers; the decoder has 4 upsampling layers, 8 convolutional layers, and one classification convolutional layer. The material classification process in this paper includes the training and testing of the RUNet model. Because remote sensing images are large, the training process randomly crops sub-images of the same size from the training set and then feeds them into the RUNet model for training. To take the spatial information of the materials into account, the testing process cuts multiple test sub-images from the test set by mirror padding and overlap cropping; RUNet then classifies the sub-images, and finally the classification results are merged back into the original test image.
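The random sub-image cropping used during training can be sketched as follows; the function name, crop size, and use of a paired label map are illustrative assumptions rather than the thesis's exact code:

```python
import numpy as np

def random_crop(image, label, size, rng):
    """Cut a matching (size, size) training sub-image from image and label.

    `image` is (H, W, C), `label` is (H, W); the same window is applied to
    both so each pixel keeps its ground-truth class.
    """
    h, w = image.shape[:2]
    y = rng.integers(0, h - size + 1)  # top-left corner, sampled uniformly
    x = rng.integers(0, w - size + 1)
    return image[y:y + size, x:x + size], label[y:y + size, x:x + size]
```

Cropping the image and its label map with the same window is what makes the random crops usable as supervised training pairs.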
To evaluate the effectiveness of the method, the Inria, Inria-2, and ISPRS remote sensing image datasets were used for material classification experiments with the RUNet model. The effects of the mirror padding and overlap cropping methods were also analyzed, as well as the impact of sub-image size on material classification. The results show that in the Inria dataset experiment, after morphological optimization of the RUNet output, the overall IoU reached about 70.82% and the accuracy was about 95.66%, better than the results of other research methods. In the Inria-2 dataset experiment, the overall IoU was about 75.5% and the accuracy was about 95.71% after optimization; although the improved FCN achieved better results, the RUNet model took less training time. In the ISPRS dataset experiment, the overall accuracy of combining multispectral, NDVI, and DSM images reached approximately 89.71%, superior to the classification results using RGB images alone. NIR and DSM can provide more material feature information, effectively reducing the classification confusion caused by identical color, shape, or texture features in RGB images. The experiments prove that our method classifies materials in remote sensing images better than other research methods do, and it is expected to be applied to model construction for simulation systems, land use monitoring, and disaster assessment in the future.
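The NDVI channel used alongside the RGB, NIR, and DSM inputs is derived from the NIR and red bands by the standard formula NDVI = (NIR − Red) / (NIR + Red). A minimal NumPy sketch (the epsilon guard against division by zero is an added assumption):

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    """Normalized difference vegetation index, in [-1, 1].

    `nir` and `red` are arrays of near-infrared and red reflectance;
    vegetation reflects strongly in NIR, so dense vegetation pushes
    the index toward +1 while bare soil or water stays near or below 0.
    """
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    return (nir - red) / (nir + red + eps)
```

Because NDVI normalizes the NIR/red contrast per pixel, it separates vegetation from materials that look similar in plain RGB, which is consistent with the improvement the abstract reports when NDVI and DSM channels are added.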

