
應用於多視角影像顯示系統虛擬視角影像生成技術之演算法及硬體架構實作

Algorithm and Architecture Design of Virtual View Synthesis in High Definition Free-Viewpoint TV System

Advisor: 陳良基

Abstract


Multi-view video technology and its applications have had a far-reaching impact on today's display industry. With this kind of display, images from several different viewpoints are presented separately to viewers at different viewing positions. Applications such as 3D-TV and free-viewpoint TV are developing rapidly, reflecting users' demand for displays that offer more than 2D video. A practical free-viewpoint display must process and present multiple views in parallel with high efficiency and must support images from arbitrary viewpoints; the conventional multi-view video format, with its limited number of captured views, clearly cannot satisfy this requirement. Virtual view synthesis is therefore the key algorithm for realizing free-viewpoint displays: from a finite set of source views, it uses their relative positions in 3D space to build a 3D model and render whatever viewpoint the user requests. Because of its algorithmic complexity, however, this technique has so far been developed mostly in software, whose processing speed falls far short of what real-time display playback requires. Improving the virtual view synthesis algorithm and designing its hardware architecture for large, high-resolution free-viewpoint displays are the focus of this thesis.

This thesis first proposes the Single Iteration view synthesis algorithm, which lets users switch viewpoints smoothly while watching the screen. The Single Iteration concept greatly reduces both the number of complex matrix operations and the redundant image-processing steps. According to the experimental statistics, the Single Iteration algorithm eliminates 86% of the unnecessary 3D pixel-warping operations while also repairing the artifacts that the original algorithm tended to introduce into synthesized images.

Second, for the hardware implementation of free-viewpoint synthesis, the most troublesome issue is the epipolar geometry constraint of 3D warping between arbitrary viewpoints; its irregularity makes the hardware architecture difficult to design. For hardware convenience, the original frame-based algorithm is restructured to process 8x8 blocks. The proposed rectification and un-rectification flow then resolves the irregularity caused by the epipolar geometry constraint. This thesis implements several selectable block shapes which, combined with the rectification and un-rectification flow, yield a hardware architecture that synthesizes images for arbitrary viewpoints; with additional artifact compensation, the quality of the synthesized images is no worse than that of the software implementation of the original algorithm.

Third, a low-cost hardware architecture is proposed for the matrix operations required by the pixel-position transforms, saving nearly 95.9% of the otherwise required hardware area. Further architectural details, such as the actual rectification and un-rectification hardware and the design of the input and output buffers, are described in the hardware chapters.

Based on these contributions to the virtual view synthesis algorithm and hardware architecture, the achieved throughput is sufficient to drive a free-viewpoint display at Quad-HD 4096x2160 resolution and 24 frames per second with 9 simultaneous arbitrary viewpoints, with no restriction on viewpoint positions. This design is also a key part of the free-viewpoint TV chip (FTV-system Setup Box) produced by our laboratory this year.
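For orientation, the 3D pixel warping mentioned above is the standard depth-image-based rendering projection: each reference pixel is back-projected with its depth into 3D space and re-projected into the virtual camera. The sketch below is a minimal illustration only; the function and variable names, the use of numpy, and the X_cam = R·X_world + t extrinsics convention are assumptions, not the thesis implementation.

    import numpy as np

    def warp_pixel(u, v, depth, K_ref, R_ref, t_ref, K_vir, R_vir, t_vir):
        """Project one reference pixel (u, v) with its depth into the virtual view."""
        ray = np.linalg.inv(K_ref) @ np.array([u, v, 1.0])   # back-project to a camera ray
        X_cam = depth * ray                                   # 3D point in the reference camera frame
        X_world = R_ref.T @ (X_cam - t_ref)                   # reference camera -> world
        X_vir = R_vir @ X_world + t_vir                       # world -> virtual camera
        p = K_vir @ X_vir                                     # perspective projection
        return p[0] / p[2], p[1] / p[2]                       # pixel location in the virtual view

The 86% saving reported above comes from skipping this per-pixel projection wherever it is redundant, so that a single processing pass suffices.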

Parallel Abstract (English)


Multi-view video and its applications have had an epochal impact on the history of TV display systems, bringing viewers a three-dimensional, realistic perceptual experience by presenting several video sequences on the display simultaneously. On special multi-view displays, different views are projected to viewers' eyes at different positions. As display technology advances, related applications such as 3D-TV and free-viewpoint TV (FTV) come closer to being realized, and the demand for high-quality video has grown in recent years. In particular, to make an FTV system convincing, the displayed images must follow the viewer's viewpoint changes over any possible location and viewing angle. The multi-view video format alone cannot support free-viewpoint sequences because it samples the spatial dimension at only a finite number of viewpoints. For this purpose, a virtual view synthesis algorithm is developed to render images seen from any virtual viewpoint using only a finite set of source images captured from fixed viewpoints. In this way the FTV system can be realized and virtual reality established for future viewers.

In this thesis, the Single Iteration View Synthesis (SIVS) algorithm is proposed first. Supporting smooth and free viewpoint switching in FTV requires both matrix-based depth-image-based rendering and complex virtual view interpolation schemes. To reduce the high computational complexity and avoid the iterative processing schedule of conventional view interpolation flows, a single-iteration view synthesis algorithm is proposed that reduces redundant warping operations by 86%. In addition, the proposed artifact detection and removal algorithm detects and eliminates artifacts caused by imperfect depth maps at the same time, so no additional post-processing or iteration is required and single-iteration processing is achieved.

Second, another serious problem arising from free-viewpoint support is the complicated epipolar geometry constraint of an unrestricted camera setting. With the proposed rectification and un-rectification flow, a hardware design for free-viewpoint virtual view synthesis becomes feasible; it also makes it convenient to load various types of sloped blocks, and the core engine of parallel line-based warping becomes easier to develop. Furthermore, line drift compensation and deviation detection are implemented to reduce the line-shaped holes that the rectification and un-rectification flow introduces into synthesized images. In the experimental analysis, 87.8% of these artifacts are recovered by the two compensation schemes, and only 0.54% of image pixels remain un-warped by the proposed virtual view synthesis hardware design.

A low-area architecture is also proposed: by employing the homographic transform concept and a linear-interpolation approximation algorithm, the large area required for the synthesis parameters is reduced. In addition, redundant fractional bits of the parameters are removed by precision-fitting analysis. The area of the matrix parameter rendering stage is reduced by 95.9% and that of the vector transform stage by 69.5%, at a cost of only 0.0059 dB in PSNR.
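As an illustration of the rectification and un-rectification flow described above, the following minimal sketch shows the intended data flow: a block is first mapped so that its epipolar lines align with horizontal scan lines, processed by a regular line-based warping core, and then mapped back to virtual-view coordinates. The helper names, the hypothetical line_warp callback, the homographies H_rect/H_unrect, and the use of numpy are assumptions, not the thesis hardware.

    import numpy as np

    def apply_homography(H, pts):
        """Apply a 3x3 homography H to an (N, 2) array of pixel coordinates."""
        homog = np.hstack([pts, np.ones((len(pts), 1))])   # to homogeneous coordinates
        mapped = homog @ H.T
        return mapped[:, :2] / mapped[:, 2:3]              # back to inhomogeneous pixels

    def rectified_block_warp(block_pts, H_rect, line_warp, H_unrect):
        """Rectify a block, run the line-based warping core, then un-rectify."""
        rect_pts = apply_homography(H_rect, block_pts)     # epipolar lines -> scan lines
        warped = line_warp(rect_pts)                       # regular per-line processing
        return apply_homography(H_unrect, warped)          # back to virtual-view pixels

Because the core engine only ever sees scan-line-aligned data, its control logic stays regular regardless of the camera arrangement; the line drift compensation and deviation detection mentioned above patch the residual line-shaped holes this mapping can leave.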
More details of the hardware architecture, such as the pipelining scheme and the parallel-mode extension that raise hardware performance and processing throughput, are introduced in the hardware design chapters. To resolve the data bandwidth problem, input and output buffering techniques are proposed and adapted to the rectification and un-rectification flow. Based on the proposed algorithm and architecture, a "High Definition FTV System Virtual View Synthesis Engine" is implemented that supports Quad-HD 4096x2160 sequences at 24 fps for 9 simultaneous viewpoints with no restriction on camera arrangement or rotation. It accompanies the chip design "FTV-system setup box", which is introduced at the end of this thesis.
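As a back-of-the-envelope check, the stated specification already fixes the synthesis throughput the engine must sustain; the short calculation below simply multiplies the numbers given above.

    # Throughput implied by the stated specification:
    # Quad-HD 4096x2160, 24 fps, 9 simultaneous viewpoints.
    width, height, fps, views = 4096, 2160, 24, 9
    pixels_per_view = width * height * fps      # 212,336,640 synthesized pixels/s per viewpoint
    total_pixels = pixels_per_view * views      # 1,911,029,760 synthesized pixels/s in total
    print(f"{pixels_per_view:,} px/s per view, {total_pixels:,} px/s total")

That is roughly 1.9 billion synthesized pixels per second across the nine views, which is the rate the parallel, pipelined architecture and the buffering scheme described above are designed to sustain.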


Cited by


姜彥宏 (2011). 利用動態規劃方法重建影像深度之系統晶片實現 [Master's thesis, National Taipei University of Technology]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0006-0508201115044400
