Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ProTo: Program-Guided Transformer for Program-Guided Tasks

Authors: Zelin Zhao, Karan Samel, Binghong Chen, Le Song

NeurIPS 2021 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that ProTo significantly outperforms the previous state-of-the-art methods on the GQA visual reasoning and 2D Minecraft policy-learning datasets. Additionally, ProTo demonstrates better generalization to unseen, complex, and human-written programs. We evaluate ProTo on two tasks, program-guided visual reasoning and program-guided policy learning (corresponding to Figure 1 left and Figure 1 right).
Researcher Affiliation | Collaboration | Zelin Zhao (The Chinese University of Hong Kong); Karan Samel (Georgia Institute of Technology); Binghong Chen (Georgia Institute of Technology); Le Song (BioMap and MBZUAI)
Pseudocode | Yes | Algorithm 1: ProTo Execution
Open Source Code | No | We will release the code and pre-trained models after publishing.
Open Datasets | Yes | We conduct experiments on program-guided visual reasoning based on the public GQA dataset [47], consisting of 22 million questions over 140 thousand images. It is divided into training, validation, and testing splits.
Dataset Splits | Yes | The GQA dataset [47] consists of 22 million questions over 140 thousand images and is divided into training, validation, and testing splits. On the training split, we train a transformer-based seq2seq model [87] to parse a question into a program. For validation and testing, we use this trained seq2seq model to acquire a program from a question. (See the parsing sketch below.)
Hardware Specification | No | The paper does not explicitly state the specific hardware (GPU models, CPU types, or memory) used to run the experiments.
Software Dependencies | No | The optimizer is the BERT Adam optimizer [24] with a base learning rate of 1×10⁻⁴, which is decayed by a factor of 0.5 every epoch. To alleviate over-fitting, we adopt an L2 weight decay of 0.01.
Experiment Setup | Yes | We take N = 50 object features (provided by the GQA dataset) with dimension d = 2048. The optimizer is the BERT Adam optimizer [24] with a base learning rate of 1×10⁻⁴, decayed by a factor of 0.5 every epoch. To alleviate over-fitting, we adopt an L2 weight decay of 0.01. The model is trained for 20 epochs on the training split, and the best model evaluated on the validation split is submitted to the public evaluation server to get testing results. (See the training-setup sketch below.)
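The Dataset Splits row describes a two-stage pipeline: a seq2seq model is trained on the training split to translate questions into programs, and the trained parser then supplies programs at validation and test time. The sketch below illustrates only that interface, using a generic PyTorch encoder-decoder; the vocabulary sizes, special tokens, greedy decoding loop, and all names are illustrative assumptions, not the parser of [87].

```python
import torch
import torch.nn as nn

# Illustrative sizes and special tokens -- assumptions, not values from the paper.
QUESTION_VOCAB, PROGRAM_VOCAB, D = 1000, 200, 256
BOS, EOS = 1, 2


class Seq2SeqParser(nn.Module):
    """Generic transformer encoder-decoder mapping question tokens to program tokens."""

    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(QUESTION_VOCAB, D)
        self.tgt_emb = nn.Embedding(PROGRAM_VOCAB, D)
        self.transformer = nn.Transformer(d_model=D, batch_first=True)
        self.out = nn.Linear(D, PROGRAM_VOCAB)

    def forward(self, src, tgt):
        hidden = self.transformer(self.src_emb(src), self.tgt_emb(tgt))
        return self.out(hidden)  # (batch, tgt_len, PROGRAM_VOCAB) logits


@torch.no_grad()
def parse_question(model, question_ids, max_len=32):
    """Greedy decoding: question token ids -> program token ids."""
    tgt = torch.tensor([[BOS]])
    for _ in range(max_len):
        logits = model(question_ids, tgt)
        next_tok = logits[:, -1].argmax(-1, keepdim=True)
        tgt = torch.cat([tgt, next_tok], dim=1)
        if next_tok.item() == EOS:
            break
    return tgt[0, 1:]  # program fed to the executor at validation/test time


# Example: parse a (dummy) tokenized question into a program.
parser = Seq2SeqParser().eval()
program = parse_question(parser, torch.randint(3, QUESTION_VOCAB, (1, 12)))
```

The key design point recoverable from the paper is that ground-truth programs are only needed for the training split; downstream, ProTo consumes whatever program the parser emits.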
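The Experiment Setup row pins down the reported hyperparameters. Below is a minimal sketch of that configuration in PyTorch, using torch.optim.AdamW as a stand-in for the BERT Adam optimizer [24] (the two differ in warmup and bias-correction details). The model and data are placeholders; only the hyperparameters (lr 1×10⁻⁴, ×0.5 decay per epoch, weight decay 0.01, 20 epochs, N = 50, d = 2048) come from the paper.

```python
import torch

# Reported values: N = 50 object features of dimension d = 2048 (from GQA).
N_OBJECTS, FEATURE_DIM = 50, 2048
model = torch.nn.Linear(FEATURE_DIM, 1)  # stand-in for the ProTo model

# AdamW approximates BERT Adam [24]: base lr 1e-4, L2 weight decay 0.01.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.01)
# "decayed by a factor of 0.5 every epoch"
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

for epoch in range(20):  # "trained for 20 epochs on the training split"
    batch = torch.randn(8, N_OBJECTS, FEATURE_DIM)  # dummy object features
    loss = model(batch).mean()  # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # halve the learning rate after each epoch
    # In the paper, the best validation checkpoint is submitted to the
    # public evaluation server for test results.
```

StepLR with step_size=1 and gamma=0.5 implements the "decayed by a factor of 0.5 every epoch" rule literally; per-epoch validation-based checkpoint selection is described in the paper, but its implementation is not.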