In this thesis, we use deep learning to estimate depth from light field images. A light field camera captures the properties of light in a scene, both spatial and angular, and from this information the scene depth can be estimated. However, the narrow baseline between views, inherent to the structure of light field cameras, makes depth estimation difficult. Many methods attempt to overcome this hardware limitation, but they still need to balance running speed against estimation accuracy. This thesis therefore considers the structural regularity of light field data and the redundancy among its images, and builds these properties into the design of our deep network. We then propose attention-based sub-aperture view selection, which lets the network learn by itself which views contribute more to depth estimation. Finally, we compare our method with other state-of-the-art methods on the benchmark to demonstrate our improvement on this problem.
In this paper, we introduce a light field depth estimation method based on a convolutional neural network. A light field camera can capture the spatial and angular properties of light in a scene, and from these properties we can compute depth information from light field images. However, the narrow baseline of light field cameras makes depth estimation difficult. Many approaches try to overcome this limitation, but they face a trade-off between speed and accuracy. We consider the repetitive structure of the light field and the redundant sub-aperture views in light field images. First, to exploit the repetitive structure of the light field, we integrate this property into our network design. Second, by applying attention-based sub-aperture view selection, we let the network learn by itself which views are more useful. Finally, we compare our experimental results with other state-of-the-art methods to show our improvement in light field depth estimation.
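The attention-based sub-aperture view selection described above can be illustrated with a minimal sketch. This is not the thesis's actual network; it only shows the core idea under simplifying assumptions: each sub-aperture view is summarized by a feature vector, a hypothetical learned scoring vector `score_weights` assigns each view a scalar score, and a softmax turns the scores into attention weights used to fuse the views.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def select_views(view_features, score_weights):
    """Attention-based sub-aperture view selection (illustrative sketch).

    view_features: (V, C) array, one feature vector per sub-aperture view.
    score_weights: (C,) hypothetical learned scoring vector (an assumption,
                   standing in for whatever scoring subnetwork is trained).
    Returns the per-view attention weights and the fused feature vector.
    """
    scores = view_features @ score_weights      # (V,) one scalar score per view
    attn = softmax(scores)                      # weights are non-negative, sum to 1
    fused = (attn[:, None] * view_features).sum(axis=0)  # attention-weighted fusion
    return attn, fused

# Usage: 9 sub-aperture views (e.g. a 3x3 angular neighborhood), 4-D features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(9, 4))
w = rng.normal(size=4)
attn, fused = select_views(feats, w)
```

In a trained network, views that contribute more to depth estimation receive larger attention weights, so redundant views are down-weighted rather than hard-discarded, keeping the selection differentiable end to end.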