
Appearance-based Gaze Estimation Using YOLOX

Advisor: 蘇志文

Abstract

Research on gaze estimation can be broadly divided into geometry-based methods and appearance-based methods. Geometry-based methods estimate gaze from the geometric positions of key eye components such as the pupil, iris, and eye corners; their drawback is that they require a stable light source and high-resolution, low-noise images, which makes them hard to apply in everyday settings, where conditions are far more complex than in a controlled laboratory environment. With the rise of deep learning in recent years, appearance-based methods have attracted increasing attention: they map images directly to gaze through a model trained on large numbers of images, and therefore remain more robust under complex lighting, low resolution, high noise, and varied head poses. However, most current appearance-based methods must first detect a single-eye, two-eye, or full-face region and then estimate gaze from that region's image. This thesis exploits the single-stage nature of the anchor-free, decoupled-head YOLOX architecture to estimate the gaze point directly, and experiments on several recent public datasets yield very good results.

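To make the single-stage idea concrete, the following is a minimal, illustrative sketch, not the thesis's actual model: a YOLOX-style decoupled head with two parallel branches, one scoring which feature-map cell contains the gaze point (anchor-free, one prediction per cell) and one regressing the point's offset within that cell. The toy backbone, all layer sizes, and the decoding step are assumptions for illustration only.

```python
# Illustrative sketch only -- a toy, YOLOX-style decoupled head for direct
# gaze-point estimation. The backbone, channel sizes, and decoding are
# assumptions; the thesis's actual network is not reproduced here.
import torch
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Stand-in feature extractor (YOLOX itself uses CSPDarknet + FPN)."""

    def __init__(self, out_channels: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, out_channels, 3, stride=2, padding=1), nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x)


class DecoupledGazeHead(nn.Module):
    """Two parallel branches, mirroring YOLOX's decoupled head: one scores
    whether a feature-map cell contains the gaze point, the other regresses
    the point's (dx, dy) offset inside that cell."""

    def __init__(self, in_channels: int = 128):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(in_channels, in_channels, 1), nn.SiLU())
        self.conf_branch = nn.Conv2d(in_channels, 1, 1)  # per-cell confidence
        self.reg_branch = nn.Conv2d(in_channels, 2, 1)   # per-cell (dx, dy)

    def forward(self, feat: torch.Tensor):
        feat = self.stem(feat)
        return self.conf_branch(feat), self.reg_branch(feat)


def decode_gaze_point(conf: torch.Tensor, reg: torch.Tensor) -> torch.Tensor:
    """Pick the most confident cell and add its regressed offset, giving a
    gaze point normalized to [0, 1] in image coordinates."""
    b, _, h, w = conf.shape
    idx = conf.view(b, -1).argmax(dim=1)            # flat index of best cell
    cell_x, cell_y = (idx % w).float(), (idx // w).float()
    offsets = reg.view(b, 2, -1)[torch.arange(b), :, idx]
    return torch.stack(
        [(cell_x + offsets[:, 0]) / w, (cell_y + offsets[:, 1]) / h], dim=1
    )


if __name__ == "__main__":
    image = torch.randn(1, 3, 224, 224)             # dummy face image
    conf, reg = DecoupledGazeHead()(ToyBackbone()(image))
    print(decode_gaze_point(conf, reg))             # e.g. tensor([[x, y]])
```

Because the confidence and offset branches are decoupled, the network predicts the gaze point in one forward pass, with no separate eye- or face-region detection stage, which is the property the abstract attributes to the YOLOX design.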

References


[1] Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, "YOLOX: Exceeding YOLO Series in 2021," arXiv preprint arXiv:2107.08430, 2021.
[2] K. Krafka et al., "Eye Tracking for Everyone," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2176-2184.
[3] X. Zhang, S. Park, T. Beeler, D. Bradley, S. Tang, and O. Hilliges, "ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation," in Proc. European Conference on Computer Vision (ECCV), 2020.
[4] J. Chen and Q. Ji, "3D Gaze Estimation with a Single Camera without IR Illumination," in Proc. International Conference on Pattern Recognition (ICPR), 2008, pp. 1-4.
