Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
Authors: Alexandre Rame, Corentin Dancette, Matthieu Cord
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of Fishr for out-of-distribution generalization. Notably, Fishr improves the state of the art on the DomainBed benchmark and performs consistently better than Empirical Risk Minimization. Our code is available at https://github.com/alexrame/fishr. |
| Researcher Affiliation | Collaboration | 1Sorbonne Université, CNRS, LIP6, Paris, France 2Valeo.ai. Correspondence to: Alexandre Ramé <alexandre.rame@sorbonneuniversite.fr>. |
| Pseudocode | Yes | Algorithm 1 Training procedure for Fishr on DomainBed. |
| Open Source Code | Yes | Our code is available at https://github.com/alexrame/fishr. |
| Open Datasets | Yes | We conduct extensive experiments on the DomainBed benchmark (Gulrajani & Lopez-Paz, 2021). In addition to the synthetic ColoredMNIST (Arjovsky et al., 2019) and RotatedMNIST (Ghifary et al., 2015), the multi-domain image classification datasets are the real VLCS (Fang et al., 2013), PACS (Li et al., 2017), OfficeHome (Venkateswara et al., 2017), TerraIncognita (Beery et al., 2018) and DomainNet (Peng et al., 2019). |
| Dataset Splits | Yes | The data from each domain is split into 80% (used as training and testing) and 20% (used as validation for hyperparameter selection) splits. |
| Hardware Specification | Yes | For example, on PACS (7 classes and |ω| = 14,343) with a ResNet-50 and batch size 32, Fishr induces an overhead in memory of +0.2% and in training time of +2.7% (with a Tesla V100) compared to ERM. |
| Software Dependencies | No | The paper mentions using PyTorch and the BackPACK package but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | To limit access to the test domains, the framework enforces that all methods are trained with only 20 different hyperparameter configurations and for the same number of steps. Results are averaged over three trials. This experimental setup is further described in Appendix D.1. |
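As context for the rows above: Fishr regularizes training by matching, across domains, the variances of per-sample gradients (the diagonal of the empirical Fisher-style covariance). The following is a minimal numpy sketch of that penalty, not the authors' BackPACK-based implementation from the repository; the function name and the array layout (one matrix of per-sample gradients per domain) are illustrative assumptions.

```python
import numpy as np

def fishr_penalty(grads_per_domain):
    """Sketch of a Fishr-style penalty: align per-domain gradient variances.

    grads_per_domain: list of arrays of shape (n_samples_i, n_params),
    each holding per-sample gradients collected on one training domain.
    Returns a scalar measuring how far each domain's gradient variance
    is from the mean variance across domains (zero when all match).
    """
    # Elementwise variance of the gradients within each domain: shape (n_params,)
    variances = [g.var(axis=0) for g in grads_per_domain]
    # Average variance across domains, the target every domain is pulled toward
    mean_var = np.mean(variances, axis=0)
    # Sum of mean squared deviations from the cross-domain average
    return float(sum(((v - mean_var) ** 2).mean() for v in variances))
```

In the paper's setup this penalty is computed on last-layer gradients only (hence |ω| = 14,343 for PACS) and added to the ERM loss with a warmup schedule; the sketch omits those details.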