Can contrastive learning avoid shortcut solutions?

Authors: Joshua Robinson, Li Sun, Ke Yu, Kayhan Batmanghelich, Stefanie Jegelka, Suvrit Sra

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we observe that IFM reduces feature suppression, and as a result improves performance on vision and medical imaging tasks. We train encoders with ResNet-18 backbone using SimCLR [5]. To study correlations between the loss value and error on downstream tasks, we train 33 encoders on Trifeature and 7 encoders on STL-digits with different hyperparameter settings (see App. C.2 for full details on training and hyperparameters).
Researcher Affiliation | Academia | Joshua Robinson, MIT CSAIL & LIDS, joshrob@mit.edu; Li Sun, University of Pittsburgh, lis118@pitt.edu; Ke Yu, University of Pittsburgh, yu.ke@pitt.edu; Kayhan Batmanghelich, University of Pittsburgh, kayhan@pitt.edu; Stefanie Jegelka, MIT CSAIL, stefje@csail.mit.edu; Suvrit Sra, MIT LIDS, suvrit@mit.edu
Pseudocode | No | The paper describes methods using mathematical formulations and descriptive text, but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | The code is available at: https://github.com/joshr17/IFM
Open Datasets | Yes | We use two datasets with known semantic features: (1) In the Trifeature data, [16] each image is 128 × 128 and has three features: color, shape, and texture... and benchmarks IFM on ImageNet100 [44] using MoCo-v2... and COPDGene dataset [38]
Dataset Splits | Yes | All encoders are evaluated using the test accuracy of a linear classifier trained on the full training dataset (see Appdx. C.4 for full setup details). The values are the average of 5-fold cross validation with standard deviations.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions software like PyTorch and scikit-learn in its references but does not provide specific version numbers for these or any other software dependencies required for replication.
Experiment Setup | Yes | We train encoders with ResNet-18 backbone using SimCLR [5]. All encoders have ResNet-50 backbones and are trained for 400 epochs (with the exception of on ImageNet100, which is trained for 200 epochs). We train ResNet-18 encoders for 200 epochs with τ ∈ {0.05, 0.2, 0.5} and IFM using ε = 0.1 for simplicity (see Appdx. C.4.1 for full setup details).
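For orientation on the method quoted in the Research Type row: implicit feature modification (IFM) adversarially perturbs embeddings, and the perturbation admits a closed form that simply shifts the positive logit down by ε and every negative logit up by ε inside a SimCLR-style InfoNCE loss. Below is a minimal PyTorch sketch of that objective. It is an illustrative reconstruction, not the authors' released implementation (that lives at https://github.com/joshr17/IFM); in particular, the function name `ifm_info_nce` and the equal-weight average of the clean and perturbed losses are assumptions.

```python
import torch
import torch.nn.functional as F

def ifm_info_nce(z1, z2, tau=0.2, eps=0.1):
    """SimCLR-style InfoNCE with the closed-form IFM logit shift (sketch)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                      # (2N, d)
    sim = z @ z.t()                                     # cosine similarities
    n = z1.size(0)
    # Row i's positive is the other augmented view of the same image.
    pos_idx = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    self_mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    pos_mask = torch.zeros_like(self_mask)
    pos_mask[torch.arange(2 * n, device=z.device), pos_idx] = True

    def nce(logits):
        # Drop self-similarities, apply temperature, softmax against the positive.
        return F.cross_entropy(
            logits.masked_fill(self_mask, float("-inf")) / tau, pos_idx
        )

    # Closed-form IFM perturbation: positive logit down by eps,
    # every negative logit up by eps.
    neg_mask = ~(pos_mask | self_mask)
    perturbed = sim - eps * pos_mask.float() + eps * neg_mask.float()
    # Assumed combination: equal-weight average of clean and perturbed losses.
    return 0.5 * (nce(sim) + nce(perturbed))
```

With `z1, z2 = encoder(aug1(x)), encoder(aug2(x))`, calling `ifm_info_nce(z1, z2, tau=0.2, eps=0.1)` matches the τ and ε values quoted in the Experiment Setup row.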
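The Dataset Splits row quotes a linear probe evaluated with 5-fold cross-validation. A generic scikit-learn sketch of that protocol, assuming frozen encoder features already extracted into NumPy arrays; the paper does not pin down the probe's solver or regularisation, so `LogisticRegression` with default settings is an assumption here:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def linear_probe_cv(features: np.ndarray, labels: np.ndarray, folds: int = 5):
    """Mean/std accuracy of a linear classifier over k-fold CV on frozen features."""
    probe = LogisticRegression(max_iter=5000)  # assumed probe; paper leaves it open
    scores = cross_val_score(probe, features, labels, cv=folds, scoring="accuracy")
    return scores.mean(), scores.std()
```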
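As a reading aid for the Experiment Setup row, the quoted sweep amounts to a small grid; the dictionary keys below are hypothetical, chosen only to mirror the quoted values (ResNet-18, 200 epochs, τ ∈ {0.05, 0.2, 0.5}, IFM ε = 0.1):

```python
# Hypothetical config grid mirroring the quoted Experiment Setup row.
sweep = [
    {"backbone": "resnet18", "epochs": 200, "tau": tau, "ifm_eps": 0.1}
    for tau in (0.05, 0.2, 0.5)
]
```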