
3D Video Stabilization with Depth Estimation by CNN-based Optimization (基於卷積神經網路最佳化深度估測用於三維影片穩定之技術)

Advisor: Yi-Ping Hung (洪一平)

Abstract


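Video stabilization removes jitter from shaky footage while preserving the video's original primary motion, and it is a basic, essential technique for improving visual quality. Most earlier stabilization methods rely on transformations in the 2D image plane, so they struggle with videos whose scene depth varies widely and produce distortion in the output. We propose a deep learning method that performs the stabilizing transformation using 3D spatial information. We first use an optimization framework built from two convolutional neural networks to estimate the scene depth and the 3D camera trajectory of an input video. The framework requires neither pre-training nor training data; instead, it learns directly on the input video at test time and optimizes the estimates. Next, the optimized camera trajectory is smoothed, and the stabilized video is reconstructed from the estimated 3D scene. Within the smoothing algorithm, we let users adjust the degree of stabilization for the same input video in real time, a capability that most current deep learning methods do not offer. To the best of our knowledge, ours is the first deep learning method based on 3D spatial transformation. We compare it qualitatively and quantitatively with other state-of-the-art video stabilization methods to show its advantages.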
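As a rough illustration of the last step, where a stabilized frame is rebuilt from the estimated depth and a smoothed camera pose, the sketch below maps each source pixel into a second camera view. It assumes a pinhole camera whose intrinsic matrix `K` is known and shared by both views. The function name `reproject_pixels` and every parameter name are hypothetical; the abstract does not describe the thesis's actual rendering procedure, so this is only a generic depth-based reprojection.

```python
import numpy as np

def reproject_pixels(depth, K, T_src_to_dst):
    """Map every source pixel into a destination (e.g. stabilized) camera.

    depth:        (H, W) per-pixel depth of the source frame.
    K:            (3, 3) pinhole intrinsics, assumed identical for both cameras.
    T_src_to_dst: (4, 4) rigid transform from source to destination camera coordinates.
    Returns an (H, W, 2) array of destination pixel coordinates (x, y).
    """
    h, w = depth.shape
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float64),
                         np.arange(h, dtype=np.float64))
    pixels = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # homogeneous pixels, (H, W, 3)

    rays = pixels @ np.linalg.inv(K).T      # back-project pixels to rays with z = 1
    points = rays * depth[..., None]        # 3D points in the source camera frame

    R, t = T_src_to_dst[:3, :3], T_src_to_dst[:3, 3]
    points_dst = points @ R.T + t           # move points into the destination camera frame

    proj = points_dst @ K.T                 # project with the same intrinsics
    return proj[..., :2] / proj[..., 2:3]   # perspective divide
```

Because each pixel moves according to its own depth, near and far objects shift by different amounts, which is the effect a single 2D transformation of the whole frame cannot reproduce.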

Parallel Abstract


Video stabilization removes noisy motion from an unsteady video while preserving its primary motion, and it is an essential technique for enhancing visual quality. Most prior works are based on 2D transformation models, so they suffer in scenes with complex depth. We present a novel 3D-based learning method for video stabilization. The proposed method estimates the scene depth and 3D camera motion with a CNN optimization framework that needs neither pre-training nor training data. After obtaining the estimated depth and camera motion, the stabilization process applies a smoothing algorithm to the camera trajectory and synthesizes the stabilized video using the 3D scene depth. Furthermore, the smoothing algorithm lets the user adjust the stability of the same video in real time (34.5 fps), a fundamental function that most prior learning-based methods do not provide. To the best of our knowledge, our work is the first learning method based on a 3D motion model. We show the advantages of our 3D-based method through quantitative and qualitative comparison with state-of-the-art methods.
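To make the trajectory-smoothing step concrete, here is a minimal sketch that low-pass filters a sequence of camera positions with a Gaussian kernel. The abstract does not say which smoothing algorithm the thesis uses, so the Gaussian filter is a stand-in, and the names `smooth_trajectory`, `positions`, and `sigma` are hypothetical. Only translations are handled; rotations would need smoothing in a suitable rotation representation.

```python
import numpy as np

def smooth_trajectory(positions, sigma):
    """Gaussian-smooth a camera path; a larger sigma gives a steadier path.

    positions: (N, D) array, one camera position per frame.
    sigma:     standard deviation of the kernel in frames (must be > 0).
    Returns an (N, D) array of smoothed positions.
    """
    positions = np.asarray(positions, dtype=np.float64)
    radius = max(1, int(np.ceil(3 * sigma)))          # truncate the kernel at about 3 sigma
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()                            # weights sum to 1

    # Repeat the first and last positions so the output keeps N frames.
    padded = np.pad(positions, ((radius, radius), (0, 0)), mode="edge")
    smoothed = np.empty_like(positions)
    for d in range(positions.shape[1]):
        smoothed[:, d] = np.convolve(padded[:, d], kernel, mode="valid")
    return smoothed
```

In a sketch like this, `sigma` plays the role of the user-facing stability control: changing it only reruns a cheap filter over the already estimated trajectory, without re-estimating depth or camera motion.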

