RGBD Based Gaze Estimation via Multi-Task CNN
Authors: Dongze Lian, Ziheng Zhang, Weixin Luo, Lina Hu, Minye Wu, Zechao Li, Jingyi Yu, Shenghua Gao
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments demonstrate that our method outperforms existing methods by a large margin on both our dataset and the EYEDIAP dataset. We evaluate our method on both our RGBD gaze dataset and EYEDIAP. The experimental results of different methods are listed in Table 2 and Table 3. The paper also reports ablation studies. |
| Researcher Affiliation | Academia | ShanghaiTech University; Nanjing University of Science and Technology |
| Pseudocode | No | The paper describes network architectures with diagrams but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states, 'Our dataset will facilitate the study of data-driven-based approaches for gaze tracking, and we will release our dataset to the community in the future,' but it makes no explicit mention of releasing source code for the described method and provides no link to any. |
| Open Datasets | Yes | Mora et al. (Mora, Monay, and Odobez 2014) have built a dataset based on RGBD by using a Kinect camera, but the participants were too few (only 16 participants). Krafka et al. (Krafka et al. 2016) showed that more participants would boost the performance of person-independent gaze tracking. Thus we build a large RGBD gaze dataset with 218 participants and 165,231 images. Our dataset will facilitate the study of data-driven-based approaches for gaze tracking, and we will release our dataset to the community in the future. We evaluate our method on both our RGBD gaze dataset and EYEDIAP. We follow the same strategy as (Zhang et al. 2017) to choose frame images and gaze points. After that, we divide the 14 participants into 5 groups and perform cross-validation. |
| Dataset Splits | Yes | We further use the images corresponding to 159 participants (119,318 RGB/depth image pairs) as training data and use the data corresponding to the remaining 59 participants as test data (45,913 RGB/depth image pairs). For gaze direction estimation on the EYEDIAP dataset...we divide the 14 participants into 5 groups and perform cross-validation. (See the participant-split sketch after the table.) |
| Hardware Specification | Yes | We use 8 NVIDIA Tesla K40m GPUs to train our network. |
| Software Dependencies | No | The paper states, 'We implement our method with the PyTorch (Paszke et al. 2017) framework.' While PyTorch is mentioned, no version number is provided, nor are versions given for other key software components. |
| Experiment Setup | Yes | The batch size for all of our experiments is 100 for training and 200 for testing. We use 8 NVIDIA Tesla K40m GPUs to train our network. The Stochastic Gradient Descent (SGD) optimization algorithm is adopted to train our network. We first separately pretrain the GAN and gaze estimation networks, and then we finetune the overall network to get optimal estimation performance with multi-task learning. (See the training-setup sketch after the table.) |
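
For concreteness, here is a minimal sketch of the participant-level splits described in the Dataset Splits row. The function names and the seeded shuffle are illustrative assumptions; the paper does not state how participants were assigned to the train/test partition or to the EYEDIAP folds.

```python
import random

def split_rgbd_gaze(participant_ids, n_train=159, seed=0):
    """Split the 218 participants into 159 training / 59 test participants.

    Splitting by participant rather than by image keeps the evaluation
    person-independent. NOTE: the seeded shuffle is an assumption; the
    paper does not say how participants were assigned.
    """
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]

def eyediap_folds(participant_ids, n_folds=5):
    """Group EYEDIAP's 14 participants into 5 folds for cross-validation."""
    ids = sorted(participant_ids)
    return [ids[i::n_folds] for i in range(n_folds)]
```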
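
Likewise, a minimal PyTorch sketch of the reported training setup (batch size 100 for training, 200 for testing; SGD; separate pretraining followed by joint finetuning). The model interface, `multi_task_loss` method, learning rate, momentum, and epoch count are assumptions, not details from the paper.

```python
import torch
from torch.utils.data import DataLoader

def make_loaders(train_set, test_set):
    # Batch sizes as reported: 100 for training, 200 for testing.
    train_loader = DataLoader(train_set, batch_size=100, shuffle=True)
    test_loader = DataLoader(test_set, batch_size=200, shuffle=False)
    return train_loader, test_loader

def finetune(model, train_loader, lr=0.01, momentum=0.9, epochs=10):
    # Joint finetuning stage, run after the GAN and gaze estimation
    # networks have been pretrained separately. multi_task_loss is a
    # hypothetical method standing in for the paper's combined objective.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    model.train()
    for _ in range(epochs):
        for rgb, depth, gaze in train_loader:  # assumed batch layout
            optimizer.zero_grad()
            loss = model.multi_task_loss(rgb, depth, gaze)
            loss.backward()
            optimizer.step()
```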