Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift
Authors: Saurabh Garg, Amrith Setlur, Zachary Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we first undertake a systematic empirical investigation of this combination, finding (i) that in domain adaptation settings, self-training and contrastive learning offer significant complementary gains; and (ii) that in semi-supervised learning settings, surprisingly, the benefits are not synergistic. Across eight distribution shift datasets (e.g., BREEDs, WILDS), we demonstrate that the combined method obtains 3-8% higher accuracy than either approach independently. Finally, we theoretically analyze these techniques in a simplified model of distribution shift demonstrating scenarios under which the features produced by contrastive learning can yield a good initialization for self-training to further amplify gains and achieve optimal performance, even when either method alone would fail. |
| Researcher Affiliation | Academia | Saurabh Garg Carnegie Mellon University sgarg2@andrew.cmu.edu Amrith Setlur Carnegie Mellon University asetlur@andrew.cmu.edu Zachary C. Lipton Carnegie Mellon University zlipton@andrew.cmu.edu Sivaraman Balakrishnan Carnegie Mellon University sbalakri@andrew.cmu.edu Virginia Smith Carnegie Mellon University smithv@andrew.cmu.edu Aditi Raghunathan Carnegie Mellon University aditirag@andrew.cmu.edu |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using existing open-source libraries ("WILDs [70] and RLSbench [30] open source libraries") and refers to an 'official library released with the paper' for Resnet on Cifar, providing a link to a third-party GitHub repository (https://github.com/kuangliu/pytorch-cifar). It does not explicitly state the release of the authors' own implementation code for their proposed methodology. |
| Open Datasets | Yes | For both UDA and SSL, we conduct experiments across eight benchmark datasets: four BREEDs datasets [72] Entity13, Entity30, Nonliving26, Living17; FMoW [47, 18] from WILDS benchmark; Officehome [85]; Visda [64, 63]; and CIFAR-10 [48]. |
| Dataset Splits | Yes | We partition each source and target dataset into 80% and 20% i.i.d. splits. We use 80% splits for training and 20% splits for evaluation (or validation). |
| Hardware Specification | Yes | Our experiments were performed across a combination of Nvidia T4, A6000, and V100 GPUs. |
| Software Dependencies | No | The paper mentions 'pytorch implementation' but does not specify its version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | We summarize the learning rate, batch size, number of epochs, and ℓ2 regularization parameter used in our study in Table 7. |
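The 80%/20% i.i.d. partitioning reported under "Dataset Splits" can be sketched as follows. This is a minimal illustration of that protocol, not the authors' released code; the function name and seed are hypothetical.

```python
import random

def iid_split(dataset, train_frac=0.8, seed=0):
    """Partition a dataset into i.i.d. train/eval splits (80%/20% by default),
    mirroring the splitting protocol described in the paper."""
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)  # seeded shuffle gives an i.i.d. split
    cut = int(train_frac * len(indices))
    train = [dataset[i] for i in indices[:cut]]
    evaluation = [dataset[i] for i in indices[cut:]]
    return train, evaluation

# Example: a toy "dataset" of 10 items yields an 8/2 split
train, evaluation = iid_split(list(range(10)))
```

The same split would be applied independently to each source and target dataset, with the 20% portion held out for evaluation or validation.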