Systematic Generalization: What Is Required and Can It Be Learned?
Authors: Dzmitry Bahdanau*, Shikhar Murty*, Michael Noukhovitch, Thien Huu Nguyen, Harm de Vries, Aaron Courville
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare both types of models in how much they lend themselves to a particular form of systematic generalization. Using a synthetic VQA test, we evaluate which models are capable of reasoning about all possible object pairs after training on only a small subset of them. Our findings show that the generalization of modular models is much more systematic and that it is highly sensitive to the module layout, i.e. to how exactly the modules are connected. We furthermore investigate if modular models that generalize well could be made more end-to-end by learning their layout and parametrization. |
| Researcher Affiliation | Collaboration | Dzmitry Bahdanau: Mila, Université de Montréal; AdeptMind Scholar; Element AI |
| Pseudocode | Yes | Algorithm 1 Pseudocode for creating SQOOP |
| Open Source Code | Yes | The code for all experiments is available online: https://github.com/rizar/systematic-generalization-sqoop |
| Open Datasets | No | The paper describes the synthetic SQOOP dataset and provides pseudocode for its generation, but it does not provide concrete access information (e.g., a link or repository) to a pre-generated or publicly hosted version of the dataset files themselves. |
| Dataset Splits | Yes | Our training sets contain 1 million examples, so for a dataset with #rhs/lhs = k we generate approximately 10^6 / (36 · 4 · k) different images per unique question (see the data-generation sketch below the table). ... We continuously monitored validation set performance of all models during training, selected the best one and reported its performance on the test set. |
| Hardware Specification | Yes | We also thank Nvidia for donating NVIDIA DGX-1 used for this research. |
| Software Dependencies | No | The paper mentions using the Adam optimizer with specific hyperparameters but does not list any software libraries or frameworks with their specific version numbers. |
| Experiment Setup | Yes | All models share the same stem architecture which consists of 6 layers of convolution (8 for Relation Networks), batch normalization and max pooling. The input to the stem is a 64×64×3 image, and the feature dimension used throughout the stem is 64. ... In all our experiments we used the Adam optimizer (Kingma & Ba, 2015) with hyperparameters α = 0.0001, β₁ = 0.9, β₂ = 0.999, ε = 10⁻¹⁰. ... The number of training iterations for each model was selected in preliminary investigations based on our observations of how long it takes for different models to converge. This information, as well as other training details, can be found in Table 3. (See the stem and optimizer sketch below the table.) |
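The Pseudocode and Dataset Splits rows describe how SQOOP training questions are restricted to k right-hand-side objects per left-hand-side object. The sketch below is a minimal illustration of that restriction and of the images-per-question arithmetic, under assumed details (36 objects drawn from 26 letters and 10 digits, four spatial relations, uniform sampling); names such as `make_training_questions` are illustrative and do not reproduce the paper's Algorithm 1.

```python
import random

# Illustrative sketch of the #rhs/lhs = k restriction described in the paper.
# Assumptions (not verbatim from Algorithm 1): 36 objects = 26 letters + 10
# digits, 4 spatial relations, and k right-hand-side objects sampled uniformly
# for every left-hand-side object.
OBJECTS = [chr(c) for c in range(ord("A"), ord("Z") + 1)] + [str(d) for d in range(10)]
RELATIONS = ["LEFT_OF", "RIGHT_OF", "ABOVE", "BELOW"]

def make_training_questions(k: int, seed: int = 0):
    """Return the restricted training questions as (lhs, relation, rhs) triples."""
    rng = random.Random(seed)
    questions = []
    for lhs in OBJECTS:
        rhs_pool = [obj for obj in OBJECTS if obj != lhs]
        for rhs in rng.sample(rhs_pool, k):  # only k rhs objects per lhs
            for relation in RELATIONS:
                questions.append((lhs, relation, rhs))
    return questions

k = 2
questions = make_training_questions(k)
# 36 * 4 * k unique questions; with a 1M-example training set this yields
# roughly 10**6 / (36 * 4 * k) distinct images per unique question.
print(len(questions), round(10**6 / len(questions)))
```

For k = 2 this gives 288 unique training questions and roughly 3,472 images per question, consistent with the 10^6 / (36 · 4 · k) estimate quoted in the Dataset Splits row.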
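The Experiment Setup row quotes the shared stem (6 convolution/batch-norm/max-pooling layers, a 64×64×3 input, feature dimension 64) and the Adam hyperparameters. Below is a minimal PyTorch sketch of that configuration; kernel sizes, the ReLU activation, and pooling in every block are assumptions, since the quoted text does not fully specify them, and this is not the authors' implementation.

```python
import torch
import torch.nn as nn

def make_stem(num_blocks: int = 6, in_channels: int = 3, feature_dim: int = 64):
    """Sketch of the shared stem: conv + batch norm + max pooling per block.

    Kernel size, padding, and the ReLU nonlinearity are assumptions.
    """
    layers = []
    channels = in_channels
    for _ in range(num_blocks):
        layers += [
            nn.Conv2d(channels, feature_dim, kernel_size=3, padding=1),
            nn.BatchNorm2d(feature_dim),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        ]
        channels = feature_dim
    return nn.Sequential(*layers)

stem = make_stem()
features = stem(torch.zeros(1, 3, 64, 64))  # dummy 64x64x3 input image

# Adam with the hyperparameters quoted in the Experiment Setup row.
optimizer = torch.optim.Adam(stem.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), eps=1e-10)
```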