Multiscale Convolution based Repeat Fusion Network for Real-time Semantic Segmentation

Practical applications of semantic segmentation, such as autonomous driving, require a model that processes high-resolution images both quickly and accurately. This is a challenging goal: such a model must reconcile and fuse high-resolution spatial localization information with low-resolution semantic classification information, which place conflicting demands on the network. To address this, we propose the multiscale convolution based repeat fusion network (MC-RFNet). To compensate for missing multiscale information and an insufficient receptive field, we propose a separable multiscale convolutional module, which gives each layer of the network the ability to capture multiscale information. Because it is difficult to recover a high-resolution feature map directly from shallow information, we design a repeat fusion module for high- and low-resolution features. On the one hand, this reduces the computing resources consumed by operating directly on high-resolution feature maps; on the other hand, the high-resolution maps gradually acquire deep semantic information through repeated fusion and convolution.
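The cost argument in the abstract can be illustrated with a rough multiply-accumulate (MAC) count. The abstract does not give the layout of the separable multiscale module, so the sketch below assumes one plausible form: parallel depthwise branches with different kernel sizes, concatenated and mixed by a 1x1 pointwise convolution. The function names, kernel sizes (3, 5, 7), channel count and image size are all hypothetical and are not taken from the paper.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a dense k x k convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k


def separable_multiscale_macs(h, w, c_in, c_out, kernel_sizes=(3, 5, 7)):
    """MACs for an ASSUMED separable multiscale block: one depthwise
    branch per kernel size, concatenated, then a 1x1 pointwise conv."""
    # Depthwise: each input channel is filtered independently per branch.
    depthwise = sum(h * w * c_in * k * k for k in kernel_sizes)
    # Pointwise: mixes the concatenated branch outputs down to c_out.
    pointwise = h * w * (c_in * len(kernel_sizes)) * c_out
    return depthwise + pointwise


# Hypothetical feature-map size and channel count, for illustration only.
H, W, C = 512, 1024, 64
KERNELS = (3, 5, 7)

# Baseline: three dense convolutions, one per scale.
dense = sum(standard_conv_macs(H, W, C, C, k) for k in KERNELS)
# Assumed separable multiscale block at full resolution.
separable = separable_multiscale_macs(H, W, C, C, KERNELS)
# The same block on a feature map downsampled by 4 in each dimension.
low_res = separable_multiscale_macs(H // 4, W // 4, C, C, KERNELS)

print(f"dense multiscale MACs:      {dense:,}")
print(f"separable multiscale MACs:  {separable:,}")
print(f"separable at 1/4 resolution: {low_res:,}")
```

Under these assumptions the separable block needs far fewer MACs than the dense baseline, and running it at one quarter resolution divides the cost by a further 16. This is consistent with the motivation for keeping most computation off the high-resolution branch, though the actual MC-RFNet design may differ.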

Keywords

Real Time; Semantic Segmentation; Fusion
