Even though the Virtual Reality (VR) industry has grown rapidly in recent years amid ever-expanding demand, VR applications have yet to provide a fully immersive experience: the insufficient resolution of VR head-mounted displays (HMDs) hinders users from immersing themselves further in the virtual world. In this work, we propose the Perception-Aware Temporal Upsampling pipeline (PATU), which enhances the immersive experience by improving the perceived resolution of VR HMDs. We employ an efficient neural-network-based approach together with the proposed temporal integration loss function. By taking the temporal integration mechanism of the Human Visual System (HVS) into account, our network learns the perception process of the human eye and temporally upsamples a sequence, which in turn improves its perceived resolution. Specifically, we discuss a possible deployment scenario that combines PATU with eye-tracking technology, and show that it could save up to 75% of the computational load. In an inference-time analysis and a user study, our approach runs about 1.89 times faster than the state of the art and produces more favorable results. Owing to its effectiveness and energy-efficient operation, we target standalone VR HMDs, since they have more limited hardware resources, tighter power budgets, and inferior visual quality compared with tethered ones. We adopt an off-the-shelf CNN accelerator to simulate the application scenario in which our method runs on a standalone VR HMD. Thanks to the regularity and simplicity of our method, it requires fewer hardware resources and less DRAM bandwidth than the state of the art.
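The abstract does not spell out the temporal integration loss, but its premise — that the HVS integrates rapidly displayed subframes into one perceived image — suggests a natural formulation. The following is a minimal PyTorch-style sketch under that assumption: the network emits several subframes whose temporal average should match a high-resolution reference. The name `temporal_integration_loss` and the tensor shapes are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def temporal_integration_loss(subframes: torch.Tensor,
                              hr_target: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of a temporal-integration-style loss.

    Models the HVS integrating T rapidly displayed subframes into one
    perceived image: the temporal mean of the generated subframes
    should match the high-resolution target.

    subframes: (B, T, C, H, W) upsampled subframes for one output frame
    hr_target: (B, C, H, W)    high-resolution reference frame
    """
    perceived = subframes.mean(dim=1)        # temporal integration: average over T
    return F.l1_loss(perceived, hr_target)   # penalize mismatch with the reference
```

Under this reading, the network is never forced to make each individual subframe sharp; only their integrated appearance, as the eye would perceive it, has to match the high-resolution target.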
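As a rough illustration of where the 75% figure could come from, a foveated deployment might restrict upsampling to a gaze-centered region covering a quarter of the frame area. The sketch below, with the hypothetical helper `foveated_crop` and gaze coordinates `gaze_xy`, is an assumption about how such a scenario might look, not the deployed system.

```python
import torch

def foveated_crop(frame: torch.Tensor, gaze_xy, frac: float = 0.25) -> torch.Tensor:
    """Hypothetical foveated deployment sketch: upsample only a
    gaze-centered region covering `frac` of the frame area, skipping
    the rest of the frame (for frac = 0.25, 75% of the work is skipped).

    frame:   (B, C, H, W) input frame
    gaze_xy: (cx, cy) gaze position in pixel coordinates
    """
    _, _, H, W = frame.shape
    # Scale each side by sqrt(frac) so the crop covers frac of the area.
    h, w = int(H * frac ** 0.5), int(W * frac ** 0.5)
    cx, cy = gaze_xy
    top = min(max(cy - h // 2, 0), H - h)    # clamp the crop inside the frame
    left = min(max(cx - w // 2, 0), W - w)
    return frame[:, :, top:top + h, left:left + w]
```

With `frac = 0.25`, only 25% of the pixels pass through the upsampling network, which is consistent with the up-to-75% computational saving reported above.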