The Risks of Invariant Risk Minimization
Authors: Elan Rosenfeld, Pradeep Kumar Ravikumar, Andrej Risteski
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To corroborate our theoretical findings, we run an experiment on data drawn from our model to see at what point IRM is able to recover a generalizing predictor. We generated data precisely according to our model in the linear setting, with d_c = 3, d_e = 6. The environmental means were drawn from a multivariate Gaussian prior; we randomly generated the invariant parameters and the parameters of the prior such that using the invariant features gave reasonable accuracy (71.9%) but the environmental features would allow for almost perfect accuracy on in-distribution test data (99.8%). |
| Researcher Affiliation | Academia | Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski Machine Learning Department Carnegie Mellon University elan@cmu.edu, pradeepr@cs.cmu.edu, aristesk@andrew.cmu.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks (e.g., labeled 'Pseudocode' or 'Algorithm X') were found in the paper. |
| Open Source Code | No | The paper does not provide any concrete access information (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described. |
| Open Datasets | No | The paper mentions generating its own synthetic data: 'We generated data precisely according to our model in the linear setting, with d_c = 3, d_e = 6. The environmental means were drawn from a multivariate Gaussian prior...'. No concrete access information for a publicly available or open dataset was provided. |
| Dataset Splits | No | The paper describes generating synthetic data and evaluating on 'unseen environments' but does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We generated data precisely according to our model in the linear setting, with d_c = 3, d_e = 6. The environmental means were drawn from a multivariate Gaussian prior; we randomly generated the invariant parameters and the parameters of the prior such that using the invariant features gave reasonable accuracy (71.9%) but the environmental features would allow for almost perfect accuracy on in-distribution test data (99.8%). We chose equal class marginals (η = 0.5). To prevent collapse, we kept the same environmental prior and found a single setting for λ and the learning rate which resulted in reasonable performance across all five runs. (A hedged sketch of this kind of data generator is given below the table.) |
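
The experiment setup above describes drawing synthetic data from the linear version of the paper's model (d_c = 3, d_e = 6, equal class marginals, environmental means drawn from a Gaussian prior). The snippet below is a minimal sketch of such a generator, assuming the model's usual form: a binary label, invariant features with a fixed class-conditional mean, and environmental features whose mean is redrawn from a Gaussian prior for every environment. The noise scales, the prior scale, and the identity concatenation used in place of any scrambling are illustrative assumptions; they are not the settings that produced the paper's reported 71.9% / 99.8% accuracies.

```python
# Hypothetical sketch of the paper's linear data model (not the authors' code).
# Assumptions: label y in {-1, +1} with P(y = +1) = eta; invariant features
# z_c ~ N(y * mu_c, sigma_c^2 I) shared across environments; environmental
# features z_e ~ N(y * mu_e, sigma_e^2 I) with a per-environment mean mu_e
# drawn from a Gaussian prior. All numeric values below are illustrative.
import numpy as np

d_c, d_e = 3, 6               # dimensions used in the paper's experiment
eta = 0.5                     # equal class marginals, as stated in the paper
sigma_c, sigma_e = 1.0, 0.5   # assumed noise scales (not given in the excerpt)
prior_scale = 2.0             # assumed scale of the environmental prior
rng = np.random.default_rng(0)

mu_c = rng.normal(size=d_c)   # invariant mean (assumed random draw)

def sample_environment(n, rng):
    """Draw one environment: a fresh environmental mean, then n labeled points."""
    mu_e = rng.normal(scale=prior_scale, size=d_e)   # per-environment mean
    y = np.where(rng.random(n) < eta, 1.0, -1.0)     # labels in {-1, +1}
    z_c = y[:, None] * mu_c + sigma_c * rng.normal(size=(n, d_c))
    z_e = y[:, None] * mu_e + sigma_e * rng.normal(size=(n, d_e))
    x = np.concatenate([z_c, z_e], axis=1)           # identity "scrambling" for simplicity
    return x, y

# Example: a few training environments plus one unseen test environment.
train_envs = [sample_environment(1000, rng) for _ in range(3)]
x_test, y_test = sample_environment(1000, rng)
```

In this sketch, training environments differ only through their redrawn environmental mean, while a predictor restricted to the first d_c coordinates faces the same distribution in every environment, including unseen test environments generated the same way.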