  • Degree thesis

基於全域局部多尺度注意力網路之多類別遙測影像語意分割

Global Local Multi-Scale Attention based Network for Semantic Segmentation of Multi-Class Remote Sensing Images

Advisor: 劉冠顯
Co-advisor: 林春宏 (Chuen-Horng Lin)

Abstract


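In recent years, abnormal climate has caused frequent disasters worldwide, and droughts, floods, and high temperatures have changed the terrain in some regions, so the changed terrain must be tracked continuously. Labeling regions of satellite imagery promptly and accurately has therefore become very important. Satellite imagery is combined with deep learning to label the area of specific classes, so that imagery can be relabeled after the terrain changes.

Satellite imagery can be applied to land cover, land use, urban management, and similar problems. It used to be interpreted mainly by hand or with machine learning methods, but as deep learning has matured, deep learning methods have come into use for classification, object detection, and segmentation of satellite imagery.

Semantic segmentation of satellite imagery differs from segmentation of natural images, and both feature richness and accurate localization are crucial to it. Features extracted from deeper layers are more accurate, but the image details they carry become blurred at greater depth. Features extracted from shallower layers contain more image details of secondary importance, but the features themselves are less distinct. This difference in features and image detail between deep-layer and shallow-layer features is a gap that is difficult to bridge.

To address these problems, we propose a global local multi-scale attention network model, together with two attention modules and a new upsampling method that help the model predict more effectively during training and testing:

1. Global local attention module: this module helps the model use feature maps more effectively in the segmentation task by applying feature attention enhancement separately to the global and the local feature maps, which improves the accuracy of the overall model.
2. Multi-scale attention module: this module helps the model apply convolution and feature attention enhancement to the feature maps over several different receptive fields, which strengthens the overall prediction accuracy of the feature maps.
3. New upsampling method: this method lets the model combine feature maps from two different upsampling directions during training, which more effectively compensates for the features lost after upsampling.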

Parallel Abstract


In recent years, climate anomalies have led to frequent disasters worldwide, and droughts, floods, and high temperatures have changed the terrain in some areas, so the changed terrain must be tracked continuously. It is therefore very important to label regions of satellite imagery promptly and accurately. We combine satellite imagery with deep learning to label the area of specific classes, so that the imagery can be relabeled after the terrain changes.

Remote sensing images are used for land cover, land use, urban management, and similar problems. In the past, remote sensing images were interpreted manually or with machine learning methods. As deep learning has matured, deep learning methods have come into use for remote sensing image classification, object detection, and segmentation.

Semantic segmentation of remote sensing images differs from that of natural images, and both feature richness and precise localization are crucial to it. High-level features extracted from deeper layers are more accurate, but their image details are blurred by the greater depth. Low-level features extracted from shallower layers contain more secondary image details, but the features themselves are less distinct. This difference in features and image detail between high-level and low-level features is a gap that is difficult to bridge.

To address these problems, we propose the Global Local Multi-Scale Attention Neural Network (GLMSA-Net), together with two attention modules and a new upsampling method that help the model predict more effectively during training and testing:

1. Global Local Attention (GLA) module: the GLA module helps the model use feature maps more effectively in the segmentation task by enhancing feature attention separately on the global and the local feature maps, which improves the accuracy of the overall model.
2. Multi-Scale Attention (MSA) module: the MSA module helps the model apply convolution and feature attention enhancement to the feature maps over several different receptive fields, which improves the overall prediction accuracy of the feature maps.
3. New upsampling method: the method lets the model combine feature maps from two different upsampling directions during training, which more effectively compensates for the features lost after upsampling.
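
The multi-scale attention idea in item 2 can be illustrated with a minimal sketch. The abstract does not give kernel sizes, convolution weights, or the fusion rule, so everything below is an assumption made for illustration: mean filters stand in for learned convolutions, the kernel sizes 1, 3, and 5 are arbitrary, and the branches are fused with softmax weights computed from each branch's global average. The function names `box_filter` and `multi_scale_attention` are hypothetical and are not taken from the thesis.

```python
import math


def box_filter(fmap, k):
    """k x k mean filter with zero padding (stand-in for a learned k x k convolution)."""
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += fmap[y][x]
            out[i][j] = total / (k * k)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def multi_scale_attention(fmap, kernel_sizes=(1, 3, 5)):
    """Filter one 2D feature map at several receptive fields and fuse the results.

    Assumed fusion rule (not from the thesis): each branch gets one scalar
    attention weight, the softmax of its global average response.
    """
    h, w = len(fmap), len(fmap[0])
    branches = [box_filter(fmap, k) for k in kernel_sizes]
    descriptors = [sum(sum(row) for row in b) / (h * w) for b in branches]
    weights = softmax(descriptors)
    fused = [
        [sum(wt * b[i][j] for wt, b in zip(weights, branches)) for j in range(w)]
        for i in range(h)
    ]
    return fused, weights
```

Calling `multi_scale_attention(fmap)` on a 2D list of floats returns a fused map of the same size plus one attention weight per kernel size. A trained network would learn both the convolution kernels and the attention weights, and would operate on multi-channel tensors.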

