Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds

Authors: Yoav Wald, Gal Yona, Uri Shalit, Yair Carmon

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate our theoretical observations on simulated data and the Waterbirds dataset."
Researcher Affiliation | Academia | Yoav Wald (Johns Hopkins University, ywald1@jhu.edu); Gal Yona (Weizmann Institute of Science); Uri Shalit (Technion); Yair Carmon (Tel Aviv University)
Pseudocode | Yes | Algorithm 1, "Two Phase Learning of Overparameterized Invariant Classifiers" (a hedged sketch of such a procedure appears after this table)
Open Source Code | No | The paper mentions using the DomainBed package for some baseline methods, but it does not state that code for its own proposed method (Algorithm 1) or its simulations is open-sourced, and it provides no link.
Open Datasets | Yes | "We evaluate Algorithm 1 on the Waterbirds dataset (Sagawa et al., 2020a)."
Dataset Splits | No | The paper describes splitting data for Algorithm 1 ("evenly split the data from each environment into the sets S^train_e"; see the split step in the sketch after this table) and refers to prior work for the Waterbirds setup, but it does not give explicit train/validation/test splits (percentages or counts) needed to reproduce the overall experiment.
Hardware Specification | No | The paper does not report the hardware (e.g., GPU or CPU models) used to run the experiments.
Software Dependencies | No | The paper mentions the DomainBed package for implementation and cites scikit-learn, but it provides no version numbers for any software dependency.
Experiment Setup | Yes | "We further fix r_c = 1 and r_e = 2, while N1 = 800 and N2 = 100. We then take growing values of d, while adjusting σ so that (r_c/σ)^2 ∝ d/N. For each value of d we train linear models... We repeat this for 15 random seeds... We compare both the test error and the test FNR gap when using either λ = 0 (no regularization) or λ = 5." (A simulation skeleton also appears after this table.)
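
Since Algorithm 1 is given only as pseudocode and no code is linked, the following is a minimal sketch of a two-phase procedure of that kind. Assumptions not confirmed by the paper: a logistic regression stands in for the overparameterized Phase-1 model, and Phase 2 is instantiated as a threshold search trading mean held-out error against the squared false-negative-rate (FNR) gap with weight λ. All helper names are hypothetical, not the authors' code.

```python
# Hypothetical sketch of two-phase learning; not the authors' implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression

def two_phase_learning(envs, lam=5.0, seed=0):
    """envs: list of (X, y) arrays, one pair per environment; labels y in {0, 1}."""
    rng = np.random.default_rng(seed)

    # Evenly split each environment's data into S^train_e and a held-out half.
    train, hold = [], []
    for X, y in envs:
        idx = rng.permutation(len(y))
        half = len(y) // 2
        train.append((X[idx[:half]], y[idx[:half]]))
        hold.append((X[idx[half:]], y[idx[half:]]))

    # Phase 1: fit a weakly regularized (near-interpolating) classifier on the
    # pooled training halves.
    X_tr = np.vstack([X for X, _ in train])
    y_tr = np.concatenate([y for _, y in train])
    phase1 = LogisticRegression(C=1e6, max_iter=10_000).fit(X_tr, y_tr)

    # Phase 2: on the held-out halves, choose a decision threshold b trading
    # off mean error against the FNR gap across environments (an assumed
    # instantiation of the invariance objective).
    scores = [phase1.decision_function(X) for X, _ in hold]

    def fnr(s, y, b):
        pos = y == 1
        return float(np.mean(s[pos] <= b)) if pos.any() else 0.0

    best_b, best_obj = 0.0, np.inf
    for b in np.linspace(-5.0, 5.0, 201):
        errs = [np.mean((s > b) != (y == 1)) for s, (_, y) in zip(scores, hold)]
        fnrs = [fnr(s, y, b) for s, (_, y) in zip(scores, hold)]
        obj = np.mean(errs) + lam * (max(fnrs) - min(fnrs)) ** 2
        if obj < best_obj:
            best_b, best_obj = b, obj
    return phase1, best_b
```

Usage: `two_phase_learning([(X1, y1), (X2, y2)], lam=5.0)` returns the Phase-1 model and the threshold b; a point x is then classified positive when `phase1.decision_function(x) > b`.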
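The Experiment Setup row quotes the simulation's constants but not its full data-generating process. The skeleton below shows only how the quoted quantities fit together; the data model inside `make_env` (a core feature of norm r_c, an environment feature of norm r_e whose agreement with the label differs across the two environments, plus Gaussian noise of scale σ) is inferred from the paper's setting, and the flip rates are placeholders.

```python
# Assumed simulation skeleton; constants are quoted, the data model is inferred.
import numpy as np

r_c, r_e = 1.0, 2.0        # fixed feature norms from the quoted setup
N1, N2 = 800, 100          # per-environment sample sizes
N = N1 + N2

def make_env(n, d, sigma, flip, rng):
    """Hypothetical data model: y in {-1, +1}; the environment feature agrees
    with y except with probability `flip`, which differs across environments."""
    y = rng.choice([-1, 1], size=n)
    core = np.zeros(d); core[0] = r_c          # invariant (core) direction
    envf = np.zeros(d); envf[1] = r_e          # environment-dependent direction
    s = y * np.where(rng.random(n) < flip, -1, 1)
    X = (y[:, None] * core + s[:, None] * envf
         + sigma * rng.standard_normal((n, d)))
    return X, y

for d in (10, 100, 1_000, 10_000):             # growing dimensions
    sigma = r_c * np.sqrt(N / d)               # keeps (r_c / sigma)^2 = d / N
    for seed in range(15):                     # 15 random seeds, as quoted
        rng = np.random.default_rng(seed)
        X1, y1 = make_env(N1, d, sigma, flip=0.1, rng=rng)  # flip rates are
        X2, y2 = make_env(N2, d, sigma, flip=0.4, rng=rng)  # placeholders
        # ...train linear models with lambda = 0 and lambda = 5, then compare
        # test error and the test FNR gap across environments.
```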