Systematic generalisation with group invariant predictions

Authors: Faruk Ahmed, Yoshua Bengio, Harm van Seijen, Aaron Courville

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform an empirical study on three synthetic datasets, showing that group invariance methods across inferred partitionings of the training set can lead to significant improvements at such test-time situations.
Researcher Affiliation | Collaboration | Université de Montréal & Mila; CIFAR Fellow; Microsoft Research
Pseudocode | Yes | Algorithm 1: Algorithm for PGI (see the PGI-style objective sketched after the table)
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | COLOURED MNIST: Consider an illustrative dataset with coloured MNIST digits. COCO-ON-COLOURS: We superimpose 10 segmented COCO (Lin et al., 2014) objects on coloured backgrounds. COCO-ON-PLACES: Here we superimpose the same COCO objects on scenes from the PLACES dataset (Zhou et al., 2017). (A colouring sketch follows the table.)
Dataset Splits | Yes | The training set has 800 images per category, with nine in-distribution categories and one held-out category for anomaly detection. Validation and test sets have 100 images each per category.
Hardware Specification | No | The paper mentions 'computational resources provided by Compute Canada and Mila' but does not specify particular CPU or GPU models, or detailed hardware configurations used for the experiments.
Software Dependencies | No | The paper mentions software like 'Adam optimiser (Kingma & Ba, 2014)' and 'LAYER NORM (Ba et al., 2016)' but does not provide specific version numbers for software dependencies such as Python, PyTorch, TensorFlow, or CUDA.
Experiment Setup | Yes | Training is conducted for 30 epochs, with SGD + Momentum (0.9), using batch sizes of 512. The learning rate is cut by 10 from its initial value of 0.1 at epochs 9, 18, and 24. We train for 200 epochs with SGD + Momentum (0.9), using batch sizes of 384, with an initial learning rate of 0.1 which is cut by 10 at the 120th, 160th, 180th, and 190th epochs. (An optimiser/schedule sketch follows the table.)
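The Pseudocode row above points to Algorithm 1 for PGI (Predictive Group Invariance). As a rough illustration of the core idea of enforcing invariant predictions across inferred groups, the minimal PyTorch sketch below penalises the divergence between group-averaged predictive distributions within each class. This is not the paper's exact Algorithm 1: the partition-inference step, the precise form of the penalty, and the names pgi_penalty, pgi_loss, and lam are placeholders introduced here.

```python
import torch
import torch.nn.functional as F

def pgi_penalty(logits, labels, groups, eps=1e-8):
    """Sketch of a group-invariance penalty in the spirit of PGI: for each
    class, compare the average predictive distribution of group 0 with that
    of group 1 using a KL term. Not the paper's exact Algorithm 1."""
    probs = F.softmax(logits, dim=1)
    penalty = logits.new_zeros(())
    n_terms = 0
    for c in labels.unique():
        in_class = labels == c
        p0 = probs[in_class & (groups == 0)]
        p1 = probs[in_class & (groups == 1)]
        if len(p0) == 0 or len(p1) == 0:
            continue  # class not represented in both groups in this batch
        m0, m1 = p0.mean(0), p1.mean(0)  # group-averaged predictions
        penalty = penalty + F.kl_div((m0 + eps).log(), m1, reduction="sum")
        n_terms += 1
    return penalty / max(n_terms, 1)

def pgi_loss(logits, labels, groups, lam=1.0):
    # Cross-entropy plus the weighted invariance penalty; lam is a
    # placeholder weight, not a value taken from the paper.
    return F.cross_entropy(logits, labels) + lam * pgi_penalty(logits, labels, groups)
```

The group labels here are assumed to come from the inferred partitioning of the training set that the paper describes (e.g. majority versus minority examples within each class), not from ground-truth annotations.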
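The dataset descriptions in the Open Datasets row all pair each class with a colour or background that is predictive during training but not at test time. The sketch below shows one way such a coloured-MNIST-style bias can be constructed; the palette and the 80/20 majority/minority proportion are illustrative assumptions, not values quoted from the paper.

```python
import numpy as np

# Illustrative 10-colour palette; the paper's exact colours are not quoted here.
PALETTE = np.array([
    [255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 0], [255, 0, 255],
    [0, 255, 255], [255, 128, 0], [128, 0, 255], [0, 128, 128], [128, 128, 0],
], dtype=np.float32)

def colour_mnist(images, labels, p_majority=0.8, rng=None):
    """Tint grayscale MNIST digits (N, 28, 28 arrays in [0, 1]) so that each
    class is usually drawn in its "own" colour, with a minority of examples
    drawn in a randomly chosen colour instead."""
    rng = rng or np.random.default_rng(0)
    n = len(images)
    colour_ids = labels.copy()                      # majority: class-aligned colour
    flip = rng.random(n) >= p_majority              # minority examples
    colour_ids[flip] = rng.integers(0, 10, flip.sum())
    colours = PALETTE[colour_ids]                   # (N, 3)
    return images[:, :, :, None] * colours[:, None, None, :] / 255.0
```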
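The first schedule quoted in the Experiment Setup row (30 epochs, SGD + momentum 0.9, batch size 512, learning rate 0.1 cut by 10 at epochs 9, 18, and 24) maps naturally onto a step-wise learning-rate schedule. Below is a sketch of that setup, assuming a placeholder backbone, a train_loader yielding (images, labels, inferred group ids), and the pgi_loss sketch above; it is not the paper's released training script.

```python
import torch
import torch.nn as nn

# Placeholder network standing in for the paper's backbone (a tiny CNN here,
# purely for illustration).
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)

# Quoted schedule: SGD + momentum 0.9, lr 0.1 cut by 10x at epochs 9, 18, 24.
optimiser = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimiser, milestones=[9, 18, 24], gamma=0.1)

for epoch in range(30):
    for images, labels, groups in train_loader:   # batches of 512 in the quoted setup
        optimiser.zero_grad()
        loss = pgi_loss(model(images), labels, groups)
        loss.backward()
        optimiser.step()
    scheduler.step()
```

The second quoted schedule (200 epochs, batch size 384, cuts at epochs 120, 160, 180, and 190) would follow the same pattern with different milestones.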