TransGOP: Transformer-Based Gaze Object Prediction
Authors: Binglu Wang, Chenxi Guo, Yang Jin, Haisheng Xia, Nian Liu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the GOO-Synth and GOO-Real datasets demonstrate that our TransGOP achieves state-of-the-art performance on all tracks, i.e., object detection, gaze estimation, and gaze object prediction. |
| Researcher Affiliation | Academia | 1 Xi'an University of Architecture and Technology, 2 Beijing Institute of Technology, 3 University of Science and Technology of China, 4 Mohamed bin Zayed University of Artificial Intelligence; {wbl921129, guochenxix, jin91999}@gmail.com, hsxia@ustc.edu.cn, liunian228@gmail.com |
| Pseudocode | No | The paper provides architectural diagrams and descriptions of its components but does not include formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code will be available at https://github.com/chenxiGuo/TransGOP.git. |
| Open Datasets | Yes | All experiments were conducted on the GOO-Synth and GOO-Real datasets (Tomas et al. 2021). ... Tomas et al. (2021) ... introduced the first dataset, the GOO dataset... |
| Dataset Splits | No | The paper trains on the GOO-Synth and GOO-Real datasets and discusses evaluation metrics, but it never gives explicit percentages or counts for training, validation, or test splits; it implies the standard splits of the cited datasets without detailing them. |
| Hardware Specification | Yes | All experiments are implemented based on PyTorch and one GeForce RTX 3090 Ti GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | TransGOP is trained for 50 epochs with an initial learning rate of 1e-4, and the learning rate is multiplied by 0.94 every 5 epochs. We use AdamW as our optimizer. For the gaze autoencoder, we set the hidden size to 256 and employ 200 decoder queries. In Eq. 1, the loss weights are α = 1000 and β = 10. The input image size is set to 224×224 and the predicted heatmap size is 64×64. |
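
The experiment-setup row above maps onto standard PyTorch components. The snippet below is a minimal sketch of that configuration, not the authors' implementation (see their repository linked above): the stand-in model, the decoder depth and head count, and the loss decomposition are assumptions; only AdamW, the 1e-4 initial learning rate, the ×0.94 decay every 5 epochs, the 50 epochs, the 256-dim / 200-query decoder settings, and α = 1000, β = 10 come from the quoted text.

```python
import torch
from torch import nn

# Stand-in model: a single linear layer in place of the full TransGOP
# network (the real architecture lives in the authors' repo); it exists
# only to demonstrate the optimizer/scheduler wiring.
model = nn.Linear(224 * 224 * 3, 64 * 64)  # 224x224 RGB in, 64x64 heatmap out

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# "multiplied by 0.94 every 5 epochs" maps onto StepLR with gamma=0.94.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.94)

# Gaze-autoencoder decoder per the quoted setup: hidden size 256 and 200
# decoder queries. nhead=8 and num_layers=6 are assumptions, not quoted.
decoder_layer = nn.TransformerDecoderLayer(d_model=256, nhead=8, batch_first=True)
gaze_decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)
queries = nn.Parameter(torch.zeros(200, 256))  # 200 learned decoder queries

ALPHA, BETA = 1000.0, 10.0  # loss weights from Eq. 1 of the paper

for epoch in range(50):
    # Per-batch training would go here. The objective is a weighted sum;
    # exactly which terms alpha and beta scale in Eq. 1 is an assumption:
    #   loss = l_base + ALPHA * l_alpha_term + BETA * l_beta_term
    scheduler.step()
    print(f"epoch {epoch + 1:2d}: lr = {optimizer.param_groups[0]['lr']:.3e}")
```

Running the loop prints the decayed learning rate, confirming the schedule: it holds at 1e-4 for epochs 1 through 5, drops to 9.4e-5 at epoch 6, and so on every 5 epochs.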