
Real-time Salient Object Detection with Two-Stage Algorithm and Heterogeneous Hardware Architecture

Advisor: Shao-Yi Chien (簡韶逸)

Abstract


Salient object detection segments the most conspicuous object in an image, reducing the complexity of subsequent image computation and the transmission requirements. It is therefore an important pre-processing step both for many computer vision applications and for data transmission from near-edge devices in the Internet of Things (IoT). However, existing algorithms are not suited to implementation on devices with limited computing resources. Traditional algorithms are usually built on hand-crafted features and use superpixels to reduce the amount of computation, and these steps involve irregular memory access. Recent deep-learning-based salient object detection algorithms achieve excellent accuracy, but convolutional neural networks (CNNs) require storing a large number of parameters and performing a large amount of computation, which general end devices cannot afford. We therefore propose a real-time salient object detection algorithm and its hardware architecture for portable devices. Our two-stage salient object detection first uses a lightweight CNN to predict a coarse salient-object region, and then applies an edge-refinement algorithm based on the guided filter to correct the segmentation along object boundaries. We show that, compared with existing algorithms, the two-stage approach achieves favorable performance. In addition, to the best of our knowledge, this thesis presents the first ASIC design targeting salient object detection. The design costs 12 KB of on-chip memory and 364.6K logic gates, and processes 75 frames per second.

Keywords

salient object detection

Parallel Abstract


Saliency detection, or salient object detection, is an essential pre-processing step for many computer vision applications. It extracts the most conspicuous part of an image and reduces the computation and transmission requirements. This ability is desirable for end devices with limited hardware resources. However, existing algorithms are not suitable for hardware implementation. Traditional works usually build upon manually designed priors, and their computations usually involve irregular memory access. Recently, deep-learning-based algorithms have demonstrated superior performance, but they require a large number of parameters and a large amount of computation. We propose a hardware solution for salient object detection. Our algorithm first uses a lightweight CNN to predict a coarse saliency map, which is then refined to obtain a boundary-accurate saliency map. We demonstrate that our two-stage algorithm achieves favorable performance compared to existing methods. To the best of our knowledge, this work is also the first ASIC design for salient object detection. The proposed hardware design costs 12 KB of on-chip memory and 364.6K logic gates, and runs at 75 FPS.
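As a rough illustration of the second refinement stage, the sketch below implements a grayscale guided filter in plain NumPy and uses it to snap a coarse saliency map to the edges of the guide image. This is a minimal sketch, not the thesis's actual design: the radius `r` and regularization `eps` are illustrative assumptions, and the real hardware operates on fixed-point data rather than floats.

```python
import numpy as np

def boxfilter(img, r):
    """Sum of each (2r+1)x(2r+1) window, computed with cumulative sums
    (O(1) work per pixel, independent of the radius)."""
    h, w = img.shape
    out = np.zeros_like(img)
    c = np.cumsum(img, axis=0)
    out[:r + 1] = c[r:2 * r + 1]
    out[r + 1:h - r] = c[2 * r + 1:] - c[:h - 2 * r - 1]
    out[h - r:] = c[-1:] - c[h - 2 * r - 1:h - r - 1]
    c = np.cumsum(out, axis=1)
    res = np.zeros_like(img)
    res[:, :r + 1] = c[:, r:2 * r + 1]
    res[:, r + 1:w - r] = c[:, 2 * r + 1:] - c[:, :w - 2 * r - 1]
    res[:, w - r:] = c[:, -1:] - c[:, w - 2 * r - 1:w - r - 1]
    return res

def guided_filter(I, p, r=4, eps=1e-3):
    """Refine map p so that its edges follow the guide image I.
    Both inputs are float arrays in [0, 1]; r and eps are assumed values."""
    N = boxfilter(np.ones_like(I), r)            # pixels per window
    mean_I = boxfilter(I, r) / N
    mean_p = boxfilter(p, r) / N
    cov_Ip = boxfilter(I * p, r) / N - mean_I * mean_p
    var_I = boxfilter(I * I, r) / N - mean_I ** 2
    a = cov_Ip / (var_I + eps)                   # local linear model: q = a*I + b
    b = mean_p - a * mean_I
    return boxfilter(a, r) / N * I + boxfilter(b, r) / N
```

Because the whole filter reduces to box filters over cumulative sums, its memory access pattern is regular and streaming-friendly, which is what makes this refinement step a better fit for an ASIC than superpixel-based methods with irregular access.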

Parallel Keywords

salient object detection

