Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Invariant Risk Minimization Is A Total Variation Model

Authors: Zhao-Rong Lai, Weiwen Wang

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results show that the proposed framework achieves competitive performance in several benchmark machine learning scenarios. (Abstract, Section 4. Experiments, Section 4.1. Simulation Study, Section 4.2. Real-world Experiments)
Researcher Affiliation	Academia	Department of Mathematics, College of Information Science and Technology, Jinan University, Guangzhou, China. Correspondence to: Weiwen Wang <EMAIL>.
Pseudocode	No	The paper describes update formulas for optimization (e.g., '$\Phi^{(k+1)} = \Phi^{(k)} - \eta\,\partial_{\Phi} g(\Phi^{(k)})$') in Appendix B.1 but does not present them in a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code	Yes	Code is available at https://github.com/laizhr/IRM-TV.
Open Datasets	Yes	We use the House Prices data set (footnote 1) to verify the TV-ℓ1-based models in a regression task... Footnote 1: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data (Section 4.2) This data set contains face images of celebrities (Liu et al., 2015). (Section 4.2) The Landcover data set records time series and the corresponding land cover types from the satellite data (Gislason et al., 2006; Russwurm et al., 2020; Xie et al., 2021). (Section 4.2) In this task we use the Adult data set (footnote 2) to predict if the income of an individual exceeds $50K/yr based on the census data... Footnote 2: https://archive.ics.uci.edu/dataset/2/adult (Section 4.2)
Dataset Splits	Yes	In training sample generation, $p_s(t)$ is fixed as $p_s^-$ for $t \in [0, 0.5)$ and as $p_s^+$ for $t \in [0.5, 1]$... (Section 4.1) Samples with built year in period [1900, 1950] are used for training and those with built year in period (1950, 2000] are used for test. (Section 4.2, House Price Prediction) We randomly choose two thirds of data from the subgroups Black Male and Non-Black Female for training, and then verify models across all four subgroups with the rest data. (Section 4.2, Adult Income Prediction)
Hardware Specification	No	The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies	No	The paper mentions 'Pytorch' (Appendix B.1) but does not specify its version number or any other software dependencies with their specific versions.
Experiment Setup	Yes	More implementation details can be found in the code link, such as the learning rate, the number of training epochs, etc. (Appendix B.1) We apply mini-batch subgradients with batch size 1024 in Landcover, and full-batch subgradients in the other data sets. (Appendix B.1) Table B2 provides PyTorch-style architectures of the invariant feature extractor Φ and the environment inferring measure ρ, detailing specific layers and activation functions for each dataset. (Appendix B.3)
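
The year-based split quoted under Dataset Splits for the House Prices task can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the function name `split_by_built_year`, the sample rows, and the choice of a `YearBuilt` field to hold the build year are hypothetical. Only the two intervals, [1900, 1950] for training and (1950, 2000] for test, come from the text above.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def split_by_built_year(
    rows: List[Row], year_key: str = "YearBuilt"
) -> Tuple[List[Row], List[Row]]:
    """Split rows into train/test sets by build year.

    Train: build year in [1900, 1950] (both ends inclusive).
    Test:  build year in (1950, 2000] (1950 excluded, 2000 included).
    Rows outside both intervals are dropped.
    The field name `year_key` is an assumption for illustration.
    """
    train = [r for r in rows if 1900 <= r[year_key] <= 1950]
    test = [r for r in rows if 1950 < r[year_key] <= 2000]
    return train, test


# Hypothetical sample rows, chosen to exercise the interval boundaries.
sample_rows = [{"YearBuilt": y} for y in (1899, 1900, 1950, 1951, 2000, 2001)]
train_rows, test_rows = split_by_built_year(sample_rows)
```

With these sample rows, the boundary year 1950 falls in the training set only, and the years 1899 and 2001 fall in neither set.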