
用於光場資料之頻率域數位變焦演算法與其硬體實現

A Frequency-Based Digital Refocusing Algorithm and its Hardware Implementation for Light Field Data

Advisor: 盧奕璋

Abstract


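Images captured by conventional photography are projections of the high-dimensional light information in a scene onto a two-dimensional sensor, so they record only part of the complete light field. If the complete light field can be recovered computationally, many more applications become possible.

In this thesis, we convert a Nikon D700 camera into a light field camera that records the complete light field by placing a mask between the sensor and the lens. When light passes through the mask, the effect in the spatial domain is the product of the light field function and the mask function. By the convolution theorem, a product in the spatial domain corresponds to the convolution of the two functions' Fourier transforms in the frequency domain. A pinhole array mask, which acts as a train of impulses, therefore replicates the spectrum of the light field. By changing the positions at which the frequency domain is sampled, the complete light field for different focal planes can be computed in the frequency domain. Finally, applying an inverse Fourier transform to the recovered four-dimensional light field spectrum returns it to the spatial domain and produces digitally refocused images.

Because the frequency-domain digital refocusing algorithm is computationally intensive, the last part of this thesis presents an integrated-circuit hardware accelerator for it. Implemented in a TSMC 90 nm process, the chip occupies 5.078 mm² with a core size of 2.935 mm², and consumes 567.5 mW at an operating frequency of 100 MHz. For 2048×2048 light field data, the hardware produces a 256×256 refocused image in 0.63 ns, a speedup of about 16 times over the software version.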
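The spectral replication described above can be written out in a simplified one-dimensional form. This is only a sketch under stated assumptions: the symbols l(x) (light field at the mask), m(x) (mask function), y(x) (modulated signal), and p (pinhole spacing) are introduced here for illustration, and the thesis itself works with a four-dimensional light field rather than this 1D reduction.

```latex
\[
\begin{aligned}
y(x) &= l(x)\, m(x) \\
Y(f) &= (L * M)(f) \\
m(x) &= \sum_{n} \delta(x - n p)
  \;\Longrightarrow\;
  M(f) = \frac{1}{p} \sum_{k} \delta\!\left(f - \frac{k}{p}\right) \\
Y(f) &= \frac{1}{p} \sum_{k} L\!\left(f - \frac{k}{p}\right)
\end{aligned}
\]
```

Under these assumptions, the recorded spectrum Y(f) is a sum of copies of L(f) spaced 1/p apart, which is why an impulse-train mask lets the spectrum of the light field be recovered from the sensor data when the copies do not overlap.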

Parallel Abstract


Images captured by conventional cameras are two-dimensional projections of high-dimensional light field data. In other words, the data collected are only a portion of the complete light field. If the complete light field can be recovered through computation, more applications can be built on it. In this thesis, we convert a Nikon D700 camera into a light field camera by inserting a carefully designed mask between the lens and the sensor. In the spatial domain, light passing through the mask is represented by the product of the light field and the mask function. By the convolution theorem, the Fourier transform of a product is the convolution of the two Fourier transforms. A pinhole array mask, which is a series of impulses, therefore duplicates the frequency spectrum of the light field: the spectrum is projected repeatedly onto the 2D sensor, so the 4-dimensional light field can be reconstructed. We take different slices in the frequency domain to construct light fields on different focal planes, then convert the data back to the spatial domain with an inverse Fourier transform to generate digitally refocused results. Because the runtime of the frequency-based digital refocusing algorithm is very long, we also design a hardware accelerator for it. The hardware is implemented in TSMC 90 nm technology. Its chip area and core size are 5.078 mm² and 2.935 mm², respectively, and its power consumption is 567.5 mW at 100 MHz. For a 2048×2048 light field data set, the hardware generates a 256×256 refocused image in 0.63 ns. Compared with the software version, the speedup can be as high as 16 times.
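The convolution-theorem step can be checked numerically with a small one-dimensional sketch. This is not the thesis implementation: the signal, the sizes `N` and `P`, and the helper names `dft` and `circular_convolve` are all assumptions chosen for illustration, and a plain discrete Fourier transform stands in for the four-dimensional transform used on real light field data.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a sequence (O(n^2), for illustration)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def circular_convolve(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


N = 16  # number of samples (assumed)
P = 4   # pinhole spacing in samples (assumed)

# A band-limited stand-in for the light field: energy only in bins 0 and +/-1.
signal = [1.0 + 0.5 * math.cos(2 * math.pi * t / N) for t in range(N)]

# Pinhole array mask: an impulse every P samples, zero elsewhere.
mask = [1.0 if t % P == 0 else 0.0 for t in range(N)]

# Spatial domain: light passing through the mask is a pointwise product.
product = [s * m for s, m in zip(signal, mask)]

# Frequency domain: DFT of the product equals (1/N) times the circular
# convolution of the two DFTs.
lhs = dft(product)
rhs = [v / N for v in circular_convolve(dft(signal), dft(mask))]
max_err = max(abs(a - b) for a, b in zip(lhs, rhs))

# The mask spectrum is itself an impulse train, so convolving with it
# places a copy of the signal spectrum at each of these bins.
replica_bins = [k for k, v in enumerate(dft(mask)) if abs(v) > 1e-9]

print("max difference between the two sides:", max_err)
print("mask spectrum impulses at bins:", replica_bins)
```

In this toy setup the mask spectrum has impulses every `N // P` bins, so the product's spectrum contains one copy of the signal spectrum per impulse. Taking slices of such replicated spectra is the step the abstract refers to when constructing light fields on different focal planes.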

