Suppressing Uncertainty in Gaze Estimation

Authors: Shijing Wang, Yaping Huang

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the gaze estimation benchmarks indicate that our proposed SUGE achieves state-of-the-art performance.
Researcher Affiliation | Academia | Shijing Wang, Yaping Huang*, Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, China; {shijingwang, yphuang}@bjtu.edu.cn
Pseudocode | Yes | Algorithm 1: SUGE. Input: the parameters of two encoder/fully-connected pairs (E(1), f(1)) and (E(2), f(2)), and training data (X, Y). Parameters: a small constant ε for the denominator, the number of neighbors K, and the confidence threshold τ.
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | The experiments utilize four widely used gaze estimation datasets: EyeDiap (Funes Mora, Monay, and Odobez 2014), MPIIFaceGaze (Zhang et al. 2017b), Gaze360 (Kellnhofer et al. 2019), and ETH-XGaze (Zhang et al. 2020b), the last used solely for pretraining the GazeTR model.
Dataset Splits | Yes | For a fair comparison, the data partitioning and preprocessing for these datasets are kept consistent with prior studies, as outlined by Cheng et al. (2021).
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU types) used to run the experiments.
Software Dependencies | No | We directly adopt two representative state-of-the-art (SOTA) methods, namely Gaze360 (Kellnhofer et al. 2019) and GazeTR (Cheng and Lu 2022), as implemented by Cheng et al. (2021), as baselines in our subsequent experiments. We use the same network architectures and corresponding parameter settings as these methods.
Experiment Setup | Yes | For our method, we set ε to 1 to prevent excessive Tuple MD. For the K-nearest-neighbor algorithm, we use a KD-tree with K = 4. We set the warm-up epochs to 10, and the thresholds τ for truncating the label and image confidences are both set to 0.5. (A hedged sketch of these settings follows the table.)
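Taken together, the Pseudocode and Experiment Setup rows fix the method's reported hyperparameters: ε = 1 in a denominator, a KD-tree with K = 4 neighbors, 10 warm-up epochs, and τ = 0.5 for both confidence thresholds. Since the paper releases no code, the fragment below is only a minimal sketch of how those values could be wired up; the helper names (knn_label_distances, distance_to_confidence, truncate_confidence) and the exact confidence formula are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of the hyperparameters reported above. The
# distance-to-confidence formula and all helper names are assumptions;
# the authors released no reference implementation.
import numpy as np
from scipy.spatial import cKDTree

EPS = 1.0            # ε: constant in the denominator, keeps scores bounded
K = 4                # number of nearest neighbors queried from the KD-tree
TAU = 0.5            # threshold for truncating label / image confidences
WARMUP_EPOCHS = 10   # reported number of warm-up epochs


def knn_label_distances(train_labels: np.ndarray, queries: np.ndarray):
    """Return distances/indices of the K nearest training labels per query.

    train_labels: (N, 2) array of gaze directions (e.g., yaw/pitch).
    queries:      (M, 2) array of points to look up.
    """
    tree = cKDTree(train_labels)
    dists, idx = tree.query(queries, k=K)   # dists, idx: shape (M, K)
    return dists, idx


def distance_to_confidence(dists: np.ndarray) -> np.ndarray:
    """Map mean neighbor distance to a (0, 1] score; the ε in the
    denominator is the paper's stated safeguard against blow-up."""
    return 1.0 / (dists.mean(axis=1) + EPS)


def truncate_confidence(conf: np.ndarray) -> np.ndarray:
    """Zero out confidences below τ so low-confidence samples are ignored."""
    return np.where(conf >= TAU, conf, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.normal(size=(100, 2))        # stand-in gaze annotations
    d, _ = knn_label_distances(labels, labels)
    conf = truncate_confidence(distance_to_confidence(d))
    print(conf[:5])
```

In a full training loop, the two encoder/fully-connected pairs (E(1), f(1)) and (E(2), f(2)) would presumably train without the confidence weighting during the 10 warm-up epochs; the excerpt does not spell out the loop body, so the sketch stops at the confidence computation.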