Spuriosity Didn’t Kill the Classifier: Using Invariant Predictions to Harness Spurious Features

Authors: Cian Eastwood, Shashank Singh, Andrei L. Nicolicioiu, Marin Vlastelica Pogančić, Julius von Kügelgen, Bernhard Schölkopf

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we demonstrate the effectiveness of SFB on real and synthetic data.
Researcher Affiliation | Academia | 1. Max Planck Institute for Intelligent Systems, Tübingen; 2. University of Edinburgh; 3. University of Cambridge
Pseudocode | Yes | Algorithm 1: Bias-corrected adaptation procedure. Multi-class version given by Algorithm 2.
Open Source Code | Yes | Code is available at: https://github.com/cianeastwood/sfb.
Open Datasets | Yes | We consider the Color MNIST dataset [1]. We next consider the PACS dataset [37], a 7-class image-classification dataset consisting of 4 domains: photos (P), art (A), cartoons (C) and sketches (S), with examples shown in Fig. 4. Finally, in the additional experiments of App. F.2, we consider the Camelyon17 [3] dataset from the WILDS benchmark [33]: a medical dataset with histopathology images from 5 hospitals which use different staining and imaging techniques (see Fig. 4).
Dataset Splits | Yes | Following Jiang and Veitch [31, §6.1], we create two training domains with βe ∈ {0.95, 0.7}, one validation domain with βe = 0.6, and one test domain with βe = 0.1 (see the environment-construction sketch after this table). For Camelyon17 [3], we follow WILDS [33] and use the first three domains for training, the fourth for validation, and the fifth for testing.
Hardware Specification | No | The paper mentions general training details such as using a '3-layer network' or 'ResNet-18' and the 'Adam optimizer', but does not specify any particular hardware such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions 'Adam optimizer' and 'ResNet-18', which are optimizers/models rather than software libraries, and it does not specify any software dependencies with version numbers (e.g., PyTorch 1.9, Python 3.8).
Experiment Setup | Yes | For SFB, we sweep over λS ∈ {0.01, 0.1, 1, 5, 10, 20} and λC ∈ {0.01, 0.1, 1}. For all methods, we use a 2-hidden-layer MLP with 390 hidden units, the Adam optimizer, a learning rate of 0.0001 with cosine scheduling, and dropout with p = 0.2. In addition, we use full batches (size 25000), 400 steps for ERM pre-training (which directly corresponds to the delicate penalty-annealing or warm-up periods used by penalty-based methods on Color MNIST [1, 35, 15, 74]), and 600 total steps (see the training-loop sketch after this table).
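
The Color MNIST split quoted under "Dataset Splits" can be made concrete with a minimal sketch. This is not the authors' code: the binarised labels, the two colour channels, and all function and variable names below are illustrative assumptions following the standard Color MNIST recipe of [1]; the only detail taken from the table above is that the colour agrees with the label with probability βe in each domain.

    import numpy as np

    def make_cmnist_env(images, digit_labels, beta_e, rng):
        """Build one Color MNIST environment (illustrative sketch).

        The spurious feature (colour) agrees with the binary label with
        probability beta_e, so beta_e = 0.95 gives a strong train-time
        correlation while beta_e = 0.1 reverses it at test time.
        """
        # Binarise the digit label (0-4 -> 0, 5-9 -> 1), as in the cited CMNIST recipe.
        y = (digit_labels >= 5).astype(np.int64)
        # Colour agrees with the label with probability beta_e, otherwise it is flipped.
        flip = rng.random(len(y)) > beta_e
        colour = np.where(flip, 1 - y, y)
        # Paint the grayscale image into the chosen colour channel (red/green).
        x = np.zeros((len(y), 2, 28, 28), dtype=np.float32)
        x[np.arange(len(y)), colour] = images
        return x, y

    # Domains from the quoted split: two training, one validation, one test domain.
    rng = np.random.default_rng(0)
    betas = {"train_1": 0.95, "train_2": 0.7, "val": 0.6, "test": 0.1}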
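Similarly, the hyperparameters quoted under "Experiment Setup" map onto a straightforward PyTorch training sketch. The input and output dimensions, the placeholder batch, and the plain ERM loss are assumptions; the SFB-specific penalties (weighted by λS and λC from the quoted sweep) are only indicated by a comment, since their exact form is not given in this table.

    import torch
    import torch.nn as nn

    # Backbone matching the quoted setup: a 2-hidden-layer MLP with 390 units,
    # dropout p = 0.2, Adam with lr 1e-4, and cosine scheduling over 600 steps.
    # The two-channel 28x28 input and the 2-way output are assumptions.
    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(2 * 28 * 28, 390), nn.ReLU(), nn.Dropout(p=0.2),
        nn.Linear(390, 390), nn.ReLU(), nn.Dropout(p=0.2),
        nn.Linear(390, 2),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=600)
    criterion = nn.CrossEntropyLoss()

    # Placeholder full batch; the paper uses the whole training set (size 25000).
    x = torch.randn(256, 2, 28, 28)
    y = torch.randint(0, 2, (256,))

    N_PRETRAIN, N_TOTAL = 400, 600  # ERM warm-up steps, then total steps
    for step in range(N_TOTAL):
        optimizer.zero_grad()
        loss = criterion(model(x), y)  # plain ERM objective
        if step >= N_PRETRAIN:
            # After the 400-step ERM warm-up, SFB adds its penalty terms
            # (weighted by lambda_S and lambda_C); their form is not given here.
            pass
        loss.backward()
        optimizer.step()
        scheduler.step()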