Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
Authors: Alexandre Ramé, Corentin Dancette, Matthieu Cord
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of Fishr for out-of-distribution generalization. Notably, Fishr improves the state of the art on the DomainBed benchmark and performs consistently better than Empirical Risk Minimization. Our code is available at https://github.com/alexrame/fishr. |
| Researcher Affiliation | Collaboration | 1Sorbonne Université, CNRS, LIP6, Paris, France 2Valeo.ai. Correspondence to: Alexandre Ramé <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Training procedure for Fishr on DomainBed. |
| Open Source Code | Yes | Our code is available at https://github.com/alexrame/fishr. |
| Open Datasets | Yes | We conduct extensive experiments on the DomainBed benchmark (Gulrajani & Lopez-Paz, 2021). In addition to the synthetic Colored MNIST (Arjovsky et al., 2019) and Rotated MNIST (Ghifary et al., 2015), the multi-domain image classification datasets are the real VLCS (Fang et al., 2013), PACS (Li et al., 2017), OfficeHome (Venkateswara et al., 2017), Terra Incognita (Beery et al., 2018) and DomainNet (Peng et al., 2019). |
| Dataset Splits | Yes | The data from each domain is split into 80% (used as training and testing) and 20% (used as validation for hyperparameter selection) splits. |
| Hardware Specification | Yes | For example, on PACS (7 classes and |ω| = 14,343) with a ResNet-50 and batch size 32, Fishr induces an overhead in memory of +0.2% and in training time of +2.7% (with a Tesla V100) compared to ERM. |
| Software Dependencies | No | The paper mentions using PyTorch and the BackPACK package but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | To limit access to test domain, the framework enforces that all methods are trained with only 20 different configurations of hyperparameters and for the same number of steps. Results are averaged over three trials. This experimental setup is further described in Appendix D.1. |
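To make the idea named in the paper's title concrete, the sketch below illustrates gradient-variance matching across training domains: each domain's per-sample gradient variance is pulled toward the cross-domain average. This is a minimal NumPy illustration, not the authors' implementation (which uses PyTorch with BackPACK); the function name and interface are hypothetical.

```python
import numpy as np

def fishr_penalty(per_sample_grads_by_domain):
    """Illustrative sketch of a gradient-variance matching penalty.

    per_sample_grads_by_domain: list of arrays, one per training domain,
    each of shape (n_samples_d, n_params) holding per-sample gradients.
    Returns the mean squared distance between each domain's
    gradient-variance vector and the cross-domain average variance.
    """
    # Per-domain variance of the per-sample gradients (one vector per domain).
    variances = [g.var(axis=0) for g in per_sample_grads_by_domain]
    # Average variance vector across domains.
    v_mean = np.mean(variances, axis=0)
    # Penalize deviation of each domain's variance from the average.
    return float(np.mean([np.sum((v - v_mean) ** 2) for v in variances]))
```

In training, a penalty of this form would be added to the empirical risk with a regularization weight, so that minimizing it drives the gradient variances, and hence the domain-level loss landscapes, toward invariance across domains.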