reproducibilityindex.ai

Distilling Model Failures as Directions in Latent Space

Authors: Saachi Jain, Hannah Lawrence, Ankur Moitra, Aleksander Madry

ICLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate that this framework allows us to discover and automatically caption challenging subpopulations within the training dataset. Moreover, by combining our framework with off-the-shelf diffusion models, we can generate images that are especially challenging for the analyzed model, and thus can be used to perform synthetic data augmentation that helps remedy the model's failure modes.
Researcher Affiliation	Academia	Saachi Jain, Hannah Lawrence, Ankur Moitra & Aleksander Mądry, Massachusetts Institute of Technology, Cambridge, USA
Pseudocode	No	The paper describes the method in text but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Our code is available at https://github.com/MadryLab/failure-directions.
Open Datasets	Yes	Using our framework, we can automatically identify and intervene on hard subpopulations in image datasets such as CIFAR-10, ImageNet, and Chest X-ray14.
Dataset Splits	Yes	Specifically, we train the model with 20% of the original training split, and reserve 20% for the validation set and 60% as extra data for the subset intervention.
Hardware Specification	Yes	We train our models using NVIDIA V100 GPUs.
Software Dependencies	Yes	Using an off-the-shelf stable diffusion (Rombach et al., 2022) model, we generate 100 images per class using the corresponding negative SVM caption (e.g., "a photo of a white cat on the grass") as the prompt. After adding these images to the training set, we retrain the last layer of the original model. Fine-tuning the model on these synthetic images improves accuracy on the hard subpopulation, defined according to similarity in CLIP space to the negative caption, compared to using generic images generated from the reference caption (Figure 9a). The checkpoint (sd-v1-4.ckpt) can be found here.
Experiment Setup	Yes	We use the following hyperparameters. Batch Size: 512; Epochs: 30; Peak LR: 0.02; Momentum: 0.9; Weight Decay: 5 × 10⁻⁴; Peak Epoch: 2.
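
The Dataset Splits and Experiment Setup rows can be read together as a small configuration. The sketch below is illustrative only and is not taken from the authors' repository. It partitions example indices 20% / 20% / 60% as described above, and it assumes that "Peak LR" and "Peak Epoch" describe a schedule that ramps linearly from 0 to the peak learning rate at epoch 2 and then decays linearly to 0 at epoch 30. The schedule shape, the function names `split_indices` and `triangular_lr`, and the seed are assumptions, not details confirmed by the table.

```python
import random

# Values copied from the Experiment Setup row.
EPOCHS = 30
PEAK_LR = 0.02
PEAK_EPOCH = 2


def split_indices(n, seed=0):
    """Shuffle range(n) and split it 20% / 20% / 60%.

    Returns (train, validation, extra) index lists. The seed and the
    shuffling strategy are assumptions made for this sketch.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n // 5
    n_val = n // 5
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    extra = idx[n_train + n_val:]  # remaining ~60%, used for the subset intervention
    return train, val, extra


def triangular_lr(epoch, peak_lr=PEAK_LR, peak_epoch=PEAK_EPOCH, epochs=EPOCHS):
    """Assumed schedule: linear ramp 0 -> peak_lr, then linear decay to 0."""
    if epoch <= peak_epoch:
        return peak_lr * epoch / peak_epoch
    return peak_lr * (epochs - epoch) / (epochs - peak_epoch)


if __name__ == "__main__":
    train, val, extra = split_indices(50_000)
    print(len(train), len(val), len(extra))
    print([round(triangular_lr(e), 4) for e in (0, 1, 2, 16, 30)])
```

Keeping the split and the schedule as plain functions makes it easy to substitute the real values from the released code if they differ from these assumptions.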