
Appearance-based Gaze Estimation Using YOLOX

Advisor: 蘇志文

Abstract

Research on gaze estimation can be broadly divided into geometry-based methods and appearance-based methods. Geometry-based methods estimate gaze from the geometric positions of key eye components such as the pupil, iris, and eye corners; their drawback is that they require a stable light source and high-resolution, low-noise images, which makes them hard to apply in everyday settings, where conditions are far more complex than in a controlled laboratory environment. With the rise of deep learning in recent years, appearance-based methods have attracted increasing attention: they map images directly to gaze through a model trained on large numbers of images, and therefore remain more robust under complex lighting, low resolution, high noise, and varied head poses. However, most current appearance-based methods must first detect a single-eye, two-eye, or full-face region and then estimate gaze from that region's image. This thesis exploits the single-stage nature of the anchor-free, decoupled-head YOLOX architecture to estimate the gaze point directly, and experiments on several recent public datasets yield very good results.

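To make the single-stage idea concrete, the following is a minimal, illustrative sketch, not the thesis's actual model: a YOLOX-style decoupled head with two parallel branches, one scoring which feature-map cell contains the gaze point (anchor-free, one prediction per cell) and one regressing the point's offset within that cell. The toy backbone, all layer sizes, and the decoding step are assumptions for illustration only.

```python
# Illustrative sketch only -- a toy, YOLOX-style decoupled head for direct
# gaze-point estimation. The backbone, channel sizes, and decoding are
# assumptions; the thesis's actual network is not reproduced here.
import torch
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Stand-in feature extractor (YOLOX itself uses CSPDarknet + FPN)."""

    def __init__(self, out_channels: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, out_channels, 3, stride=2, padding=1), nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x)


class DecoupledGazeHead(nn.Module):
    """Two parallel branches, mirroring YOLOX's decoupled head: one scores
    whether a feature-map cell contains the gaze point, the other regresses
    the point's (dx, dy) offset inside that cell."""

    def __init__(self, in_channels: int = 128):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(in_channels, in_channels, 1), nn.SiLU())
        self.conf_branch = nn.Conv2d(in_channels, 1, 1)  # per-cell confidence
        self.reg_branch = nn.Conv2d(in_channels, 2, 1)   # per-cell (dx, dy)

    def forward(self, feat: torch.Tensor):
        feat = self.stem(feat)
        return self.conf_branch(feat), self.reg_branch(feat)


def decode_gaze_point(conf: torch.Tensor, reg: torch.Tensor) -> torch.Tensor:
    """Pick the most confident cell and add its regressed offset, giving a
    gaze point normalized to [0, 1] in image coordinates."""
    b, _, h, w = conf.shape
    idx = conf.view(b, -1).argmax(dim=1)            # flat index of best cell
    cell_x, cell_y = (idx % w).float(), (idx // w).float()
    offsets = reg.view(b, 2, -1)[torch.arange(b), :, idx]
    return torch.stack(
        [(cell_x + offsets[:, 0]) / w, (cell_y + offsets[:, 1]) / h], dim=1
    )


if __name__ == "__main__":
    image = torch.randn(1, 3, 224, 224)             # dummy face image
    conf, reg = DecoupledGazeHead()(ToyBackbone()(image))
    print(decode_gaze_point(conf, reg))             # e.g. tensor([[x, y]])
```

Because the confidence and offset branches are decoupled, the network predicts the gaze point in one forward pass, with no separate eye- or face-region detection stage, which is the property the abstract attributes to the YOLOX design.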

References


[1] Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, "YOLOX: Exceeding YOLO Series in 2021," arXiv preprint arXiv:2107.08430, 2021.
[2] K. Krafka et al., "Eye Tracking for Everyone," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2176-2184.
[3] X. Zhang, S. Park, T. Beeler, D. Bradley, S. Tang, and O. Hilliges, "ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation," in Proc. European Conference on Computer Vision (ECCV), 2020.
[4] J. Chen and Q. Ji, "3D Gaze Estimation with a Single Camera without IR Illumination," in Proc. International Conference on Pattern Recognition (ICPR), 2008, pp. 1-4.
