Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds
Authors: Yoav Wald, Gal Yona, Uri Shalit, Yair Carmon
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theoretical observations on simulated data and the Waterbirds dataset. |
| Researcher Affiliation | Academia | Yoav Wald Johns Hopkins University ywald1@jhu.edu Gal Yona Weizmann Institute of Science Uri Shalit Technion Yair Carmon Tel Aviv University |
| Pseudocode | Yes | Algorithm 1 Two Phase Learning of Overparameterized Invariant Classifiers |
| Open Source Code | No | The paper mentions using the 'Domainbed package' for some methods, but does not explicitly state that the code for their own proposed methodology (Algorithm 1) or simulations is open-sourced or provide a link. |
| Open Datasets | Yes | We evaluate Algorithm 1 on the Waterbirds dataset (Sagawa et al., 2020a) |
| Dataset Splits | No | The paper mentions splitting data for Algorithm 1 ('evenly split the data from each environment into the sets Strain_e') and refers to prior work for the Waterbirds setup, but it does not explicitly provide the train/validation/test splits (percentages or counts) needed to reproduce the overall experiment. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Domainbed package' for implementation and 'scikit-learn' in references, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We further fix r_c = 1 and r_e = 2, while N1 = 800 and N2 = 100. We then take growing values of d, while adjusting σ so that (r_c/σ)² ∝ d/N. For each value of d we train linear models... We repeat this for 15 random seeds... We compare both the test error and the test FNR gap when using either λ = 0 (no regularization) or λ = 5. |
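The sweep described in the Experiment Setup cell can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the two-environment generative model, the feature layout, and the closed-form ridge solver are all assumptions; only the quoted constants (r_c = 1, r_e = 2, N1 = 800, N2 = 100, λ ∈ {0, 5}, 15 seeds) come from the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Constants quoted in the Experiment Setup cell.
r_c, r_e = 1.0, 2.0          # core / environment-dependent signal scales
N1, N2 = 800, 100            # samples per environment
N = N1 + N2

def sample_env(n, d, sigma, spur_sign):
    """Hypothetical data model: labels y in {-1,+1}; feature 0 carries the
    core signal r_c*y, feature 1 a spurious signal whose sign flips across
    environments; remaining coordinates are pure noise."""
    y = rng.choice([-1.0, 1.0], size=n)
    X = sigma * rng.standard_normal((n, d))
    X[:, 0] += r_c * y
    X[:, 1] += spur_sign * r_e * y
    return X, y

def ridge(X, y, lam):
    """Closed-form ridge regression: w = (X'X + lam*I)^{-1} X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def fnr(X, y, w):
    """False-negative rate: fraction of y=+1 points predicted negative."""
    pos = y > 0
    return np.mean((X[pos] @ w) < 0)

results = {}
for lam in (0.0, 5.0):
    d = 200
    sigma = np.sqrt(N / d)   # adjust noise so (r_c / sigma)^2 = d / N
    # Train on both environments pooled (spurious sign differs across them).
    X1, y1 = sample_env(N1, d, sigma, +1.0)
    X2, y2 = sample_env(N2, d, sigma, -1.0)
    w = ridge(np.vstack([X1, X2]), np.concatenate([y1, y2]), lam)
    # Test FNR gap between fresh samples from the two environments.
    T1 = sample_env(2000, d, sigma, +1.0)
    T2 = sample_env(2000, d, sigma, -1.0)
    results[lam] = abs(fnr(*T1, w) - fnr(*T2, w))
```

A full reproduction would additionally loop over growing values of d and average the test error and FNR gap over 15 random seeds, as the quoted setup describes.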