Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift

Authors: Saurabh Garg, Amrith Setlur, Zachary Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan

NeurIPS 2023

Reproducibility assessment: each variable below lists the evaluated result, followed by the LLM response that supports it.
Research Type: Experimental
LLM response: In this paper, we first undertake a systematic empirical investigation of this combination, finding (i) that in domain adaptation settings, self-training and contrastive learning offer significant complementary gains; and (ii) that in semi-supervised learning settings, surprisingly, the benefits are not synergistic. Across eight distribution shift datasets (e.g., BREEDs, WILDS), we demonstrate that the combined method obtains 3–8% higher accuracy than either approach independently. Finally, we theoretically analyze these techniques in a simplified model of distribution shift, demonstrating scenarios under which the features produced by contrastive learning can yield a good initialization for self-training to further amplify gains and achieve optimal performance, even when either method alone would fail.
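The combination assessed above, contrastive pretraining followed by self-training on pseudo-labeled target data, can be illustrated with a minimal sketch. This is a hypothetical simplification, not the authors' implementation: the encoder is assumed to be already pretrained with a contrastive objective (e.g., SimCLR or SwAV) on unlabeled data, and the confidence threshold, optimizer, and loop structure are placeholder choices.

```python
# Minimal sketch of "contrastive pretraining -> self-training" (hypothetical;
# not the paper's code). `encoder` is assumed pretrained contrastively on
# unlabeled source + target data; `head` is a fresh classification head.
import torch
import torch.nn as nn
import torch.nn.functional as F

def self_train(encoder, head, source_loader, target_loader,
               epochs=10, threshold=0.9, lr=1e-3, device="cpu"):
    """Fine-tune with labeled source data plus confident target pseudo-labels."""
    model = nn.Sequential(encoder, head).to(device)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        # zip truncates to the shorter loader; fine for a sketch.
        for (xs, ys), (xt, _) in zip(source_loader, target_loader):
            xs, ys, xt = xs.to(device), ys.to(device), xt.to(device)
            # Supervised loss on labeled source examples.
            loss = F.cross_entropy(model(xs), ys)
            # Pseudo-label target examples where the model is confident.
            with torch.no_grad():
                probs = F.softmax(model(xt), dim=1)
                conf, pseudo = probs.max(dim=1)
                mask = conf >= threshold
            if mask.any():
                loss = loss + F.cross_entropy(model(xt[mask]), pseudo[mask])
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```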
Researcher Affiliation: Academia
LLM response: All six authors are affiliated with Carnegie Mellon University: Saurabh Garg (sgarg2@andrew.cmu.edu), Amrith Setlur (asetlur@andrew.cmu.edu), Zachary C. Lipton (zlipton@andrew.cmu.edu), Sivaraman Balakrishnan (sbalakri@andrew.cmu.edu), Virginia Smith (smithv@andrew.cmu.edu), and Aditi Raghunathan (aditirag@andrew.cmu.edu).
Pseudocode: No
LLM response: The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code: No
LLM response: The paper mentions using existing open-source libraries ("WILDs [70] and RLSbench [30] open source libraries") and refers to an "official library released with the paper" for ResNet on CIFAR, linking to a third-party GitHub repository (https://github.com/kuangliu/pytorch-cifar). It does not state that the authors released their own implementation of the proposed method.
Open Datasets: Yes
LLM response: For both UDA and SSL, we conduct experiments across eight benchmark datasets: four BREEDs datasets [72] (Entity13, Entity30, Nonliving26, Living17); FMoW [47, 18] from the WILDS benchmark; Officehome [85]; Visda [64, 63]; and CIFAR-10 [48].
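Since FMoW is distributed through the open-source WILDS package, loading it takes only a few lines; the sketch below is illustrative (the root_dir path is a placeholder), and the other seven datasets ship with their own loaders.

```python
# Minimal sketch: fetching FMoW via the WILDS benchmark package
# (pip install wilds). The root_dir path below is a placeholder.
from wilds import get_dataset

dataset = get_dataset(dataset="fmow", download=True, root_dir="data")
train_data = dataset.get_subset("train")  # standard WILDS split accessor
```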
Dataset Splits: Yes
LLM response: We partition each source and target dataset into 80% and 20% i.i.d. splits. We use 80% splits for training and 20% splits for evaluation (or validation).
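The 80/20 i.i.d. partition described above is straightforward to reproduce in PyTorch; the function name and seed in this sketch are illustrative choices, not taken from the paper.

```python
import torch
from torch.utils.data import random_split

def split_80_20(dataset, seed=0):
    """I.i.d. 80/20 partition of a dataset, as described in the paper."""
    n_train = int(0.8 * len(dataset))
    n_eval = len(dataset) - n_train
    generator = torch.Generator().manual_seed(seed)  # reproducible split
    return random_split(dataset, [n_train, n_eval], generator=generator)

# train_set, eval_set = split_80_20(source_dataset)  # repeat for the target
```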
Hardware Specification: Yes
LLM response: Our experiments were performed across a combination of Nvidia T4, A6000, and V100 GPUs.
Software Dependencies: No
LLM response: The paper mentions a "pytorch implementation" but does not specify its version or the versions of any other software dependencies.
Experiment Setup: Yes
LLM response: We summarize the learning rate, batch size, number of epochs, and ℓ2 regularization parameter used in our study in Table 7.
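Table 7 itself is not reproduced in this report; the sketch below only illustrates the shape of such a configuration, covering the fields the paper reports, with placeholder values rather than the paper's actual settings.

```python
# Placeholder experiment configuration mirroring the fields reported in the
# paper's Table 7 (all values below are illustrative, not the paper's).
config = {
    "learning_rate": 3e-4,      # optimizer step size
    "batch_size": 256,          # examples per gradient step
    "epochs": 50,               # passes over the 80% training split
    "l2_regularization": 1e-4,  # weight decay coefficient
}
```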