Is Importance Weighting Incompatible with Interpolating Classifiers?
Authors: Ke Alexander Wang, Niladri Shekhar Chatterji, Saminul Haque, Tatsunori Hashimoto
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we demonstrate the practical value of our analysis with neural network experiments on a subpopulation shift and a label shift dataset. |
| Researcher Affiliation | Academia | Department of Computer Science, Stanford University; {alxwang,niladri}@cs.stanford.edu, {saminulh,thashim}@stanford.edu |
| Pseudocode | No | The paper focuses on theoretical derivations and empirical results but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/KeAWang/importance-weighting-interpolating-classifiers |
| Open Datasets | Yes | We construct our label shift dataset from the full CIFAR10 dataset. ... For our subpopulation shift dataset, we use the CelebA with spurious correlations dataset constructed by Sagawa et al. (2019). |
| Dataset Splits | Yes | We then use 80% of those examples for training and the rest for validation. |
| Hardware Specification | No | The paper describes experiments with neural networks but does not specify any particular hardware components (e.g., GPU models, CPU types, memory) used for training or evaluation. |
| Software Dependencies | No | The paper mentions 'SGD' as an optimizer but does not specify any software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, Python 3.x). |
| Experiment Setup | Yes | We train for 400 epochs with SGD with a batch size of 64. We use a constant 0.001 learning rate with 0.9 momentum... We use a constant 0.008 learning rate with no momentum... We train for 100 epochs with SGD with a batch size of 64. We use a constant 0.0004 learning rate with 0.9 momentum for all settings. For CelebA, we exponentiate by 2 and for CIFAR10 we exponentiate by 3/2. For these experiments, we fixed α = 1... |
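
The Dataset Splits and Experiment Setup rows quote an 80% training split and per-dataset exponents applied to the importance weights. The following is a minimal sketch of those two steps in plain Python, not the authors' implementation. The function names, the shuffle before splitting, the seed, and the dictionary keys are all assumptions made for illustration. How the exponentiated weights enter the loss is not covered here.

```python
import random


def split_train_val(indices, train_fraction=0.8, seed=0):
    """Shuffle indices and split them into train and validation lists.

    The 0.8 fraction comes from the table; shuffling with a fixed seed
    is an assumption, not something the quoted text specifies.
    """
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def exponentiate_weights(weights, exponent):
    """Raise each importance weight to a fixed power."""
    return [w ** exponent for w in weights]


# Exponents quoted in the table (2 and 3/2); the keys are hypothetical names.
WEIGHT_EXPONENTS = {"celeba": 2.0, "cifar10": 1.5}

if __name__ == "__main__":
    train_idx, val_idx = split_train_val(range(1000))
    print(len(train_idx), len(val_idx))
    print(exponentiate_weights([1.0, 4.0], WEIGHT_EXPONENTS["cifar10"]))
```

With 1000 examples the split yields 800 training and 200 validation indices. The released repository linked in the Open Source Code row remains the reference for the actual procedure.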