CLIP-Gaze: Towards General Gaze Estimation via Visual-Linguistic Model
Authors: Pengwei Yin, Guanzhong Zeng, Jingjing Wang, Di Xie
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the excellent performance of CLIP-Gaze over existing methods on four cross-domain evaluations. [...] To verify the performance of our method in the gaze estimation task, we use ETH-XGaze (Zhang et al. 2020) and Gaze360 (Kellnhofer et al. 2019) as training sets, and test the gaze model on MPIIFaceGaze (Zhang et al. 2017) and EyeDiap (Funes Mora, Monay, and Odobez 2014). Thus, we evaluate on four cross-domain tasks in total. [...] Quantitative results of the four cross-domain gaze estimation tasks are shown in Tab. 1. |
| Researcher Affiliation | Industry | Pengwei Yin (1,2)*, Guanzhong Zeng (1,2)*, Jingjing Wang (1,2), Di Xie (1,2). (1) Hikvision Research Institute, Hangzhou, China; (2) Zhejiang Key Laboratory of Social Security Big Data, China. {yinpengwei,zengguanzhong,wangjingjing9,xiedi}@hikvision.com |
| Pseudocode | No | The paper describes methods in prose and with figures (e.g., Figure 2 and 3), but it does not contain a formal 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not explicitly state that source code for the methodology is provided or include a link to a code repository. |
| Open Datasets | Yes | To verify the performance of our method in the gaze estimation task, we use ETH-XGaze (Zhang et al. 2020) and Gaze360 (Kellnhofer et al. 2019) as training sets, and test the gaze model on MPIIFaceGaze (Zhang et al. 2017) and EyeDiap (Funes Mora, Monay, and Odobez 2014). |
| Dataset Splits | No | The paper mentions 'training set' and 'test' sets, but does not explicitly provide details about a validation set or specific percentages/counts for train/test/validation splits. |
| Hardware Specification | Yes | We conduct the experiments on a single Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions models like ResNet-18 and uses terms common in deep learning frameworks (e.g., MLP, FC layer), but it does not specify software dependencies with version numbers like 'PyTorch 1.9' or 'Python 3.8'. |
| Experiment Setup | Yes | We set the batch size to 128 and train the model for 30 epochs on ETH-XGaze and Gaze360. [...] λ1, λ2, λ3 are hyper-parameters, and we empirically set λ1 = λ2 = λ3 = 1.0. |
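The setup details quoted above can be collected into a short sketch. This is not the authors' code: the cross-domain task list follows directly from the train/test datasets named in the paper, while the assumption that the total loss is a weighted sum of a primary gaze term and three auxiliary terms (names `l_aux1`–`l_aux3` are placeholders) is inferred only from the statement that λ1, λ2, λ3 are hyper-parameters set to 1.0.

```python
# Reported training configuration (from the paper's experiment section).
BATCH_SIZE = 128
EPOCHS = 30
LAMBDA1 = LAMBDA2 = LAMBDA3 = 1.0  # "we empirically set λ1 = λ2 = λ3 = 1.0"

# The four cross-domain evaluation tasks: each training set is
# evaluated against each test set (2 x 2 = 4 tasks).
TRAIN_SETS = ["ETH-XGaze", "Gaze360"]
TEST_SETS = ["MPIIFaceGaze", "EyeDiap"]
CROSS_DOMAIN_TASKS = [(tr, te) for tr in TRAIN_SETS for te in TEST_SETS]


def total_loss(l_gaze, l_aux1, l_aux2, l_aux3,
               lam1=LAMBDA1, lam2=LAMBDA2, lam3=LAMBDA3):
    """Assumed form: primary gaze loss plus three weighted auxiliary
    terms. The decomposition into exactly these named terms is a
    hypothetical reading of the paper's λ1/λ2/λ3 description."""
    return l_gaze + lam1 * l_aux1 + lam2 * l_aux2 + lam3 * l_aux3
```

With all three weights at 1.0 the objective reduces to a plain sum of the four terms, which matches the paper's "empirically set" default.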