Iterated learning for emergent systematicity in VQA

Authors: Ankit Vani, Max Schwarzer, Yuchen Lu, Eeshan Dhekane, Aaron Courville

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our regularized iterated learning method can outperform baselines without iterated learning on SHAPES-SyGeT (SHAPES Systematic Generalization Test), a new split of the SHAPES dataset we introduce to evaluate systematic generalization, and on CLOSURE, an extension of CLEVR also designed to test systematic generalization. We demonstrate superior performance in recovering ground-truth compositional program structure with limited supervision on both SHAPES-SyGeT and CLEVR. In this section, we present results on SHAPES-SyGeT, CLEVR, and CLOSURE.
Researcher Affiliation | Academia | Ankit Vani (Mila, Université de Montréal); Max Schwarzer (Mila, Université de Montréal); Yuchen Lu (Mila, Université de Montréal); Eeshan Dhekane (Mila, Université de Montréal); Aaron Courville (Mila, Université de Montréal; CIFAR Fellow)
Pseudocode | Yes | We present the full IL algorithm described in Section 3.2 in Algorithm 1.
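The row above notes that the paper gives its full iterated-learning (IL) procedure as Algorithm 1. As a rough, self-contained illustration of the generational phase structure suggested by the hyperparameter names reported elsewhere in this assessment (interacting, transmitting, and learning phases), here is a toy sketch. The models are stubbed as dicts and the "training" steps are simple counters; this illustrates only the control flow, not the authors' actual Algorithm 1.

```python
# Toy sketch of an iterated-learning generation loop. All names and the
# stubbed "models" are hypothetical; only the phase ordering (interact,
# transmit, re-train a fresh learner) is taken from the reported setup.

def run_generation(pg, questions, t_interact, t_transmit):
    """One IL generation: interact, transmit, then train a fresh learner."""
    # Interacting phase: pg is updated for t_interact steps (a counter
    # increment stands in for gradient updates on the task).
    pg["steps"] = pg.get("steps", 0) + t_interact

    # Transmitting phase: the current pg labels a dataset of
    # (question, program) pairs for its successor.
    transmitted = [(q, pg["label_fn"](q)) for q in questions[:t_transmit]]

    # Learning phase: a re-initialized successor is trained only on the
    # transmitted pairs before the next round of interaction.
    successor = {"steps": 0, "label_fn": pg["label_fn"], "seen": transmitted}
    return successor

pg = {"label_fn": lambda q: q.upper()}  # stand-in "program generator"
questions = [f"q{i}" for i in range(5)]
for _ in range(3):  # three generations
    pg = run_generation(pg, questions, t_interact=2000, t_transmit=2)

print(len(pg["seen"]))  # 2 transmitted pairs per generation
print(pg["seen"][0])    # ('q0', 'Q0')
```

The key design point the sketch preserves is the information bottleneck: each new learner sees only what the previous generation transmitted, not that generation's full training signal.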
Open Source Code | No | The paper provides a link for the SHAPES-SyGeT dataset ('https://github.com/ankitkv/SHAPES-SyGeT'), but it does not contain an explicit statement or link for the open-source code implementing the described methodology.
Open Datasets | Yes | To demonstrate our method, we introduce a lightweight benchmark for systematic generalization research based on the popular SHAPES dataset (Andreas et al., 2016) called SHAPES-SyGeT (SHAPES Systematic Generalization Test). Our experiments on SHAPES-SyGeT, CLEVR (Johnson et al., 2017a), and CLOSURE (Bahdanau et al., 2019a)... SHAPES-SyGeT can be downloaded from: https://github.com/ankitkv/SHAPES-SyGeT.
Dataset Splits | Yes | To evaluate the in-distribution and out-of-distribution generalization performance of our models, we prepare the SHAPES-SyGeT training set with only a subset of the questions under each train template and use the rest as an in-distribution validation set (Val-IID). Questions belonging to the evaluation templates are used as an out-of-distribution validation set (Val-OOD). Table 4: Split of questions in the SHAPES and SHAPES-SyGeT datasets. (b) SHAPES-SyGeT dataset: Train: 7560 total questions (135 unique); Val-IID: 1080 total questions (135 unique); Val-OOD: 6976 total questions (109 unique).
Hardware Specification | Yes | It takes an execution engine over a day to reach 95% task accuracy on the CLEVR validation set on an Nvidia RTX-8000 GPU, even when trained with ground-truth programs without a program generator.
Software Dependencies | No | The paper mentions 'Optimizer Adam (Kingma & Ba, 2015)' but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or CUDA versions).
Experiment Setup | Yes | Table 3 details the hyperparameters used in our experiments in Section 5 for both SHAPES-SyGeT and CLEVR. Table 3: IL hyperparameters used in our experiments:
- Optimizer: Adam (Kingma & Ba, 2015)
- PG learning rate: 0.0001
- EE learning rate: 0.0001 for CLEVR; 0.0005 for SHAPES-SyGeT Tensor-NMN; 0.001 for Tensor-FiLM-NMN
- Batch size: 128
- GT programs in batch: 4
- PG REINFORCE weight: 10.0
- PG GT cross-entropy weight: 1.0
- PG spectral normalization: on for SHAPES-SyGeT, off for CLEVR
- Interacting phase length T_i: 2000 for SHAPES-SyGeT, 5000 for CLEVR
- Transmitting dataset size T_t: 2000 × batch size = 256000
- PG learning phase length T_p: 2000
- EE learning phase length T_e: for SHAPES-SyGeT, 250 when re-initializing EE from scratch, 200 for seeded IL; 50 for both SHAPES-SyGeT and CLEVR when not resetting EE
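For anyone reimplementing this setup, the Table 3 values above can be transcribed into a per-dataset config. The key names below are our own invention (the paper specifies only the values), and the final line checks the stated transmitting-set arithmetic: 2000 batches at batch size 128 gives 256000 examples.

```python
# Hypothetical config transcription of Table 3; key names are assumptions,
# values are as reported in the paper.
CONFIG = {
    "shapes_syget": {
        "optimizer": "adam",
        "pg_lr": 1e-4,
        "ee_lr": {"tensor_nmn": 5e-4, "tensor_film_nmn": 1e-3},
        "batch_size": 128,
        "gt_programs_in_batch": 4,
        "pg_reinforce_weight": 10.0,
        "pg_gt_ce_weight": 1.0,
        "pg_spectral_norm": True,
        "t_interact": 2000,
        "t_transmit_batches": 2000,
        "t_pg": 2000,
        "t_ee": {"scratch": 250, "seeded": 200, "no_reset": 50},
    },
    "clevr": {
        "optimizer": "adam",
        "pg_lr": 1e-4,
        "ee_lr": 1e-4,
        "batch_size": 128,
        "gt_programs_in_batch": 4,
        "pg_reinforce_weight": 10.0,
        "pg_gt_ce_weight": 1.0,
        "pg_spectral_norm": False,
        "t_interact": 5000,
        "t_transmit_batches": 2000,
        "t_pg": 2000,
        "t_ee": {"no_reset": 50},
    },
}

# Transmitting-set size: 2000 batches at the stated batch size of 128.
transmit_size = CONFIG["clevr"]["t_transmit_batches"] * CONFIG["clevr"]["batch_size"]
print(transmit_size)  # 256000
```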