Predict then Interpolate: A Simple Algorithm to Learn Stable Classifiers

Authors: Yujia Bao, Shiyu Chang, Regina Barzilay

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results demonstrate that our algorithm is able to learn robust classifiers (outperforms IRM by 23.85% on synthetic environments and 12.41% on natural environments). We evaluate our method on both text classification and image classification.
Researcher Affiliation | Collaboration | Yujia Bao (MIT CSAIL), Shiyu Chang (MIT-IBM Watson AI Lab), Regina Barzilay (MIT CSAIL).
Pseudocode | No | The paper describes the algorithm in Section 3.1 using numbered stages and text, but it does not provide a formally labeled pseudocode block or algorithm figure. (A hedged code sketch of those stages follows the table.)
Open Source Code | Yes | Our code and data are available at https://github.com/YujiaBao/Predict-then-Interpolate.
Open Datasets | Yes | For MNIST, we adopt Arjovsky et al.'s (2019) approach for generating spurious correlation and extend it to a more challenging multi-class problem (a sketch of the original Colored MNIST construction follows the table). For Beer Review, we consider three aspect-level sentiment classification tasks: look, aroma and palate (Lei et al., 2016; Bao et al., 2018). We study two datasets: CelebA (Liu et al., 2015b), where the attributes are annotated by humans, and ASK2ME (Bao et al., 2019), where the attributes are automatically generated by rules.
Dataset Splits | Yes | For both datasets, we consider two different validation settings and report their performance separately: 1) sampling the validation set from the training environment; 2) sampling the validation set from the testing environment. (Both settings are sketched in code after the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud instance specifications used for running the experiments.
Software Dependencies | No | The paper states 'Our implementation builds on PyTorch (Paszke et al., 2019) and Adam optimizer (Kingma & Ba, 2014).', but it does not specify version numbers for PyTorch or any other software libraries.
Experiment Setup | Yes | For all models, we use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 1e-3 and a batch size of 128. We train the models for 100 epochs. (These values are instantiated in the configuration sketch after the table.)
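
The Pseudocode row above notes that Predict then Interpolate is described only in prose (numbered stages in Section 3.1). Below is a minimal PyTorch sketch of how those stages could be organized: train one ERM classifier per training environment, use each classifier to partition the other environments into correctly and incorrectly predicted subsets, and train the final model against those partitions. The helper names are hypothetical, and the final stage is written here as a worst-group (group-DRO-style) update, one way to approximate the paper's worst-case objective over the partitions; the authors' released repository is the authoritative implementation.

```python
# Hedged sketch of the staged procedure summarized in the Pseudocode row.
# Helper names are hypothetical, and stage 3 is written as a worst-group
# (group-DRO-style) update, one way to approximate the paper's worst-case
# objective over the partitions; the authors' repository is authoritative.
import itertools
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset

def train_erm(model, dataset, epochs=100, lr=1e-3, batch_size=128):
    # Plain ERM on a single environment (hyperparameters follow the
    # Experiment Setup row; the architecture of `model` is unspecified).
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
    return model

def partition_by_predictions(model, dataset, batch_size=128):
    # Split `dataset` into the examples the classifier predicts correctly
    # and the ones it gets wrong.
    model.eval()
    correct, wrong = [], []
    offset = 0
    with torch.no_grad():
        for x, y in DataLoader(dataset, batch_size=batch_size):
            pred = model(x).argmax(dim=1)
            for k, ok in enumerate((pred == y).tolist()):
                (correct if ok else wrong).append(offset + k)
            offset += len(y)
    return Subset(dataset, correct), Subset(dataset, wrong)

def predict_then_interpolate(make_model, train_envs, epochs=100):
    # Stage 1: train one ERM classifier per training environment.
    env_models = [train_erm(make_model(), env, epochs=epochs) for env in train_envs]

    # Stage 2: let classifier f_i partition every other environment E_j
    # into correctly and incorrectly predicted subsets.
    groups = []
    for i, j in itertools.permutations(range(len(train_envs)), 2):
        groups.extend(partition_by_predictions(env_models[i], train_envs[j]))
    groups = [g for g in groups if len(g) > 0]

    # Stage 3 (sketched as group DRO): repeatedly take a gradient step on
    # the currently worst-performing group.
    final = make_model()
    opt = torch.optim.Adam(final.parameters(), lr=1e-3)
    loaders = [DataLoader(g, batch_size=128, shuffle=True) for g in groups]
    iters = [iter(l) for l in loaders]
    steps = epochs * max(len(l) for l in loaders)
    for _ in range(steps):
        losses = []
        for k, loader in enumerate(loaders):
            try:
                x, y = next(iters[k])
            except StopIteration:
                iters[k] = iter(loader)
                x, y = next(iters[k])
            losses.append(F.cross_entropy(final(x), y))
        opt.zero_grad()
        torch.stack(losses).max().backward()  # worst-group loss
        opt.step()
    return final
```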
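
The Open Datasets row cites Arjovsky et al. (2019) for the spurious-correlation construction on MNIST. The snippet below sketches that original binary Colored MNIST recipe: a noisy binary label derived from the digit plus a color channel that correlates with the label at an environment-specific rate. The paper's multi-class extension is not reproduced here, and the probabilities shown follow the original construction rather than this paper's exact settings.

```python
# Sketch of the binary Colored MNIST construction from Arjovsky et al. (2019)
# that the paper builds on; the paper's multi-class extension and its exact
# flip/correlation probabilities are not reproduced here.
import torch
from torchvision import datasets

def make_colored_mnist_env(images, digits, color_flip_prob, label_flip_prob=0.25):
    # One environment: a noisy binary label derived from the digit, plus a color
    # channel that agrees with the label with probability 1 - color_flip_prob.
    images = images[:, ::2, ::2].float() / 255.0              # downsample 28x28 -> 14x14
    labels = (digits >= 5).float()                            # binary task: digit < 5 vs >= 5
    flip = torch.rand(len(labels)) < label_flip_prob
    labels = torch.where(flip, 1.0 - labels, labels)          # label noise
    mismatch = torch.rand(len(labels)) < color_flip_prob
    colors = torch.where(mismatch, 1.0 - labels, labels)      # spurious color
    x = torch.stack([images, images], dim=1)                  # two color channels
    x[torch.arange(len(x)), (1 - colors).long(), :, :] = 0.0  # zero the unused channel
    return x, labels.long()

mnist = datasets.MNIST("data", train=True, download=True)
imgs, digs = mnist.data, mnist.targets
# Two training environments with different correlation strengths and a test
# environment where the spurious correlation is reversed.
env1 = make_colored_mnist_env(imgs[0::3], digs[0::3], color_flip_prob=0.1)
env2 = make_colored_mnist_env(imgs[1::3], digs[1::3], color_flip_prob=0.2)
test = make_colored_mnist_env(imgs[2::3], digs[2::3], color_flip_prob=0.9)
```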
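
The two validation settings quoted in the Dataset Splits row can be expressed as a small helper; the 10% split fraction and the use of random_split below are illustrative assumptions, not the authors' exact protocol.

```python
# Illustration of the two validation settings quoted in the Dataset Splits row;
# the 10% split fraction and the use of random_split are assumptions, not the
# authors' exact protocol.
import torch
from torch.utils.data import ConcatDataset, random_split

def make_validation_split(train_envs, test_env, setting, val_frac=0.1, seed=0):
    g = torch.Generator().manual_seed(seed)
    if setting == "from_train":
        # Setting 1: hold out a slice of each training environment.
        kept, held_out = [], []
        for env in train_envs:
            n_val = int(len(env) * val_frac)
            tr, va = random_split(env, [len(env) - n_val, n_val], generator=g)
            kept.append(tr)
            held_out.append(va)
        return kept, ConcatDataset(held_out)
    if setting == "from_test":
        # Setting 2: sample the validation set from the testing environment
        # (the remainder of test_env is kept for final evaluation).
        n_val = int(len(test_env) * val_frac)
        va, _test_remainder = random_split(
            test_env, [n_val, len(test_env) - n_val], generator=g)
        return list(train_envs), va
    raise ValueError(f"unknown setting: {setting}")
```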
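
The Experiment Setup row pins down the optimizer, learning rate, batch size, and epoch count; the fragment below simply instantiates those quoted values in PyTorch, with the model and dataset left as placeholders for the paper's task-specific architectures and data.

```python
# The configuration quoted in the Experiment Setup row, expressed as PyTorch
# objects; `model` and `train_set` are placeholders for the paper's
# task-specific architectures and datasets.
import torch
from torch.utils.data import DataLoader

LEARNING_RATE = 1e-3   # "a learning rate of 1e-3"
BATCH_SIZE = 128       # "a batch size of 128"
NUM_EPOCHS = 100       # "we train the models for 100 epochs"

def configure_training(model, train_set):
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
    loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)
    return optimizer, loader, NUM_EPOCHS
```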