Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
A Generalized Label Shift Perspective for Cross-Domain Gaze Estimation
Authors: Hao-Ran Yang, Xiaohui Chen, Chuan-Xian Ren
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on standard CDGE tasks with different backbone models validate the superior generalization capability across domain and applicability on various models of proposed method. |
| Researcher Affiliation | Academia | Hao-Ran Yang Sun Yat-Sen University Guangzhou, China EMAIL Xiaohui Chen Sun Yat-Sen University Guangzhou, China EMAIL Chuan-Xian Ren Sun Yat-Sen University Guangzhou, China EMAIL |
| Pseudocode | Yes | Algorithm 1: Optimization of GLSGE |
| Open Source Code | No | As our work is about a general framework for CDGE problems, the reproducibility can be guaranteed by the algorithm description and implementation details provided in Sec. 4 and Appendix A. |
| Open Datasets | Yes | We conduct experiments on four standard CDGE datasets: ETH-XGaze (DE) [47], Gaze360 (DG) [13], MPIIFaceGaze (DM) [49] and EyeDiap (DD) [8]. |
| Dataset Splits | Yes | During cross-domain learning, we use 10% of the unlabeled target domain images for training and another 10% for validation, with the remaining 80% used for testing. It means that 4500 images in DM and 1667 images in DD are used for training in each task. |
| Hardware Specification | Yes | An NVIDIA RTX 4080 GPU is used for the experiments. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'cosine annealing scheduler' but does not provide specific version numbers for any software libraries or frameworks. The prompt explicitly requires specific version numbers for ancillary software. |
| Experiment Setup | Yes | We use the Adam optimizer with the learning rate of 3e-5 and a cosine annealing scheduler to decrease the learning rate in the training process. The batch size is set to be 100. As the domain shift is distinct at the beginning, we alternately correct the label shift and the conditional shift to produce better pseudo label. The confidence that decides the truncated area in label shift correction process is empirically set to 0.7 for all tasks. |
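
The Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. It is not the authors' code, since no source code was released. The function names, the random seed, the total step count, and the minimum learning rate of 0 are all hypothetical. The 45,000-image total is an assumption inferred from the statement that 4500 images make up the 10% training share of DM. The schedule uses the standard cosine annealing formula, which may differ in detail from the scheduler used in the paper.

```python
import math
import random


def split_target_indices(n_images, train_frac=0.10, val_frac=0.10, seed=0):
    """Shuffle image indices and cut them into disjoint train/val/test parts.

    The 10% / 10% / 80% proportions come from the Dataset Splits row;
    the seed and the shuffling strategy are assumptions.
    """
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    n_train = round(n_images * train_frac)
    n_val = round(n_images * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # everything left over (about 80%)
    return train, val, test


def cosine_annealed_lr(step, total_steps, base_lr=3e-5, min_lr=0.0):
    """Standard cosine annealing from base_lr (step 0) to min_lr (last step).

    base_lr matches the reported learning rate; total_steps and min_lr
    are not reported and are assumptions here.
    """
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Assumed total of 45,000 target images, so that 10% equals 4500.
    train, val, test = split_target_indices(45_000)
    print(len(train), len(val), len(test))

    # Learning rate at the start, middle, and end of a hypothetical 1000-step run.
    for step in (0, 500, 1000):
        print(step, cosine_annealed_lr(step, total_steps=1000))
```

Fixing the seed keeps the three index sets identical across runs, which matters when the same unlabeled target images must be reused for training and validation in every task.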