reproducibilityindex.ai

Out-of-Distribution Generalization via Risk Extrapolation (REx)

Authors: David Krueger, Ethan Caballero, Joern-Henrik Jacobsen, Amy Zhang, Jonathan Binas, Dinghuai Zhang, Remi Le Priol, Aaron Courville

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate REx and compare with IRM on a range of tasks requiring OOD generalization. REx provides generalization benefits and outperforms IRM on a wide range of tasks, including: i) variants of the Colored MNIST (CMNIST) dataset (Arjovsky et al., 2019) with extra covariate shift, ii) continuous control tasks with partial observability and spurious features, iii) domain generalization tasks from the DomainBed suite (Gulrajani & Lopez-Paz, 2020).
Researcher Affiliation	Collaboration	1Mila, 2University of Montreal, 3Vector, 4University of Toronto, 5McGill University, 6Facebook AI Research. Correspondence to: <david.scott.krueger@gmail.com>.
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not explicitly state that source code is available or provide a link to a repository for the described methodology.
Open Datasets	Yes	We evaluate REx and compare with IRM on a range of tasks requiring OOD generalization. REx provides generalization benefits and outperforms IRM on a wide range of tasks, including: i) variants of the Colored MNIST (CMNIST) dataset (Arjovsky et al., 2019) with extra covariate shift, ii) continuous control tasks with partial observability and spurious features, iii) domain generalization tasks from the DomainBed suite (Gulrajani & Lopez-Paz, 2020).
Dataset Splits	No	The paper mentions 'using the most commonly used training-domain validation set method for model selection' and 'average over 3 different train/valid splits' in Section 4.3, but it does not provide specific percentages, sample counts, or explicit definitions of these splits (e.g., '80/10/10 split') needed for reproduction.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment.
Experiment Setup	No	The paper states, 'We use the exact same hyperparameters as Arjovsky et al. (2019)' and 'using hyperparameters tuned on cartpole_swingup', but does not provide the specific hyperparameter values or detailed training configurations within its main text, deferring to external sources for these details.