Environment Inference for Invariant Learning
Authors: Elliot Creager, Jörn-Henrik Jacobsen, Richard Zemel
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We show that EIIL outperforms invariant learning methods on the CMNIST benchmark without using environment labels, and significantly outperforms ERM on worst-group performance in the Waterbirds and CivilComments datasets. Finally, we establish connections between EIIL and algorithmic fairness, which enables EIIL to improve accuracy and calibration in a fair prediction problem." (Section 5, Experiments) |
| Researcher Affiliation | Academia | Elliot Creager, Jörn-Henrik Jacobsen, Richard Zemel (University of Toronto; Vector Institute). "Correspondence to: Elliot Creager <creager@cs.toronto.edu>." |
| Pseudocode | Yes | See Algorithm 1 in Appendix A for pseudocode. |
| Open Source Code | Yes | See https://github.com/ecreager/eiil for code. |
| Open Datasets | Yes | "CMNIST is a noisy digit recognition task..." (Arjovsky et al., 2019); "Waterbirds dataset" (Sagawa et al., 2020); "UCI Adult dataset, which comprises 48,842 census records" (https://archive.ics.uci.edu/ml/datasets/adult); "CivilComments-WILDS... follow the procedure and data splits of Koh et al. (2021)" |
| Dataset Splits | Yes | "Where possible, we reuse effective hyperparameters for IRM and Group DRO found by previous authors. Because these works allowed limited validation samples for hyperparameter tuning (all baseline methods benefit fairly from this strategy)..." and "worst-group validation accuracy is used to tune hyperparameters for all methods." |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions software such as "DistilBERT embeddings" and "Hugging Face's transformers" but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | No | The paper states "Where possible, we reuse effective hyperparameters for IRM and Group DRO found by previous authors." and defers model-selection details to "Appendix E for further discussion", but the main text does not give concrete hyperparameter values, training configurations, or system-level settings. |
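Several rows above refer to worst-group accuracy, the metric the paper uses both to report Waterbirds/CivilComments results and to tune hyperparameters. As a minimal sketch of how that metric is typically computed — the function name and the toy arrays below are illustrative, not taken from the paper's code:

```python
import numpy as np

def worst_group_accuracy(preds, labels, groups):
    """Return (min per-group accuracy, dict of per-group accuracies).

    preds, labels, groups are equal-length 1-D integer arrays; each entry
    of `groups` identifies the (label, spurious-attribute) group of that
    example, e.g. {waterbird, landbird} x {water, land background}.
    """
    accs = {}
    for g in np.unique(groups):
        mask = groups == g
        accs[g] = float(np.mean(preds[mask] == labels[mask]))
    return min(accs.values()), accs

# Toy example: group 0 is classified perfectly, group 1 only 1/3 correct,
# so worst-group accuracy is 1/3 even though average accuracy is 2/3.
preds  = np.array([1, 1, 0, 0, 1, 0])
labels = np.array([1, 1, 0, 1, 0, 0])
groups = np.array([0, 0, 0, 1, 1, 1])
worst, per_group = worst_group_accuracy(preds, labels, groups)
```

Selecting hyperparameters by maximizing this quantity on a validation set (rather than average accuracy) is what the quoted passage "worst-group validation accuracy is used to tune hyperparameters for all methods" describes.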