Is Importance Weighting Incompatible with Interpolating Classifiers?

Authors: Ke Alexander Wang, Niladri Shekhar Chatterji, Saminul Haque, Tatsunori Hashimoto

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we demonstrate the practical value of our analysis with neural network experiments on a subpopulation shift and a label shift dataset.
Researcher Affiliation | Academia | Department of Computer Science, Stanford University. {alxwang,niladri}@cs.stanford.edu, {saminulh,thashim}@stanford.edu
Pseudocode | No | The paper focuses on theoretical derivations and empirical results but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/KeAWang/importance-weighting-interpolating-classifiers
Open Datasets | Yes | We construct our label shift dataset from the full CIFAR10 dataset. ... For our subpopulation shift dataset, we use the CelebA with spurious correlations dataset constructed by Sagawa et al. (2019).
Dataset Splits | Yes | We then use 80% of those examples for training and the rest for validation.
Hardware Specification | No | The paper describes experiments with neural networks but does not specify any particular hardware components (e.g., GPU models, CPU types, memory) used for training or evaluation.
Software Dependencies | No | The paper mentions 'SGD' as an optimizer but does not specify any software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, Python 3.x).
Experiment Setup | Yes | We train for 400 epochs with SGD with a batch size of 64. We use a constant 0.001 learning rate with 0.9 momentum... We use a constant 0.008 learning rate with no momentum... We train for 100 epochs with SGD with a batch size of 64. We use a constant 0.0004 learning rate with 0.9 momentum for all settings. For CelebA, we exponentiate by 2 and for CIFAR10 we exponentiate by 3/2. For these experiments, we fixed α = 1...