Equivariant Self-Supervised Learning: Encouraging Equivariance in Representations
Authors: Rumen Dangovski, Li Jing, Charlotte Loh, Seungwook Han, Akash Srivastava, Brian Cheung, Pulkit Agrawal, Marin Soljacic
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate E-SSL's effectiveness empirically on several popular computer vision benchmarks, e.g. improving SimCLR to 72.5% linear probe accuracy on ImageNet. Furthermore, we demonstrate usefulness of E-SSL for applications beyond computer vision; in particular, we show its utility on regression problems in photonics science. |
| Researcher Affiliation | Collaboration | Rumen Dangovski MIT EECS rumenrd@mit.edu; Li Jing Facebook AI Research ljng@fb.com; Charlotte Loh MIT EECS cloh@mit.edu; Seungwook Han MIT-IBM Watson AI Lab sh3264@columbia.edu; Akash Srivastava MIT-IBM Watson AI Lab akashsri@mit.edu; Brian Cheung MIT CSAIL & BCS cheungb@mit.edu; Pulkit Agrawal MIT CSAIL pulkitag@mit.edu; Marin Soljačić MIT Physics soljacic@mit.edu |
| Pseudocode | Yes | Algorithm 1: PyTorch-style pseudocode for E-SSL, predicting four-fold rotations (a hedged sketch of this objective follows the table). |
| Open Source Code | Yes | Our code, datasets and pre-trained models are available at https://github.com/rdangovs/essl to aid further research in E-SSL. |
| Open Datasets | Yes | in our experiments on standard computer vision data, such as the small-scale CIFAR-10 (Torralba et al., 2008; Krizhevsky, 2009) and the large-scale Image Net (Deng et al., 2009) |
| Dataset Splits | Yes | We report the kNN accuracy in (%) on the validation set. |
| Hardware Specification | No | The paper mentions 'HPC and consultation resources' and 'GPU hours' but does not provide specific hardware details such as exact GPU/CPU models or memory amounts. |
| Software Dependencies | No | The paper mentions 'PyTorch-style pseudocode' but does not specify version numbers for PyTorch or any other software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | Our experiments use the following architectural choices: ResNet-18 backbone (the CIFAR-10 version has kernel size 3, stride 1, padding 1 and there is no max pooling afterwards); 512 batch size (only our baseline SimSiam model uses batch size 1024); 0.03 base learning rate for the baseline SimCLR and SimSiam and 0.06 base learning rate for E-SimCLR and E-SimSiam; 800 pre-training epochs; standard cosine decayed learning rate; 10 epochs for the linear warmup; two layer projector with hidden dimension 2048 and output dimension 2048; for SimSiam a two layer (bottleneck) predictor with hidden dimension 512 whose learning rate is not decayed; the last batch normalization for the projector does not have learnable affine parameters; 0.0005 weight decay value; SGD with momentum 0.9 optimizer. The augmentation is random resized cropping with scale (0.2, 1.0), aspect ratio (3/4, 4/3) and size 32x32, random horizontal flips with probability 0.5, color jittering (0.4, 0.4, 0.4, 0.1) with probability 0.8 and grayscale with probability 0.2. (A sketch of this recipe appears below the table.) |
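
The pseudocode row above refers to Algorithm 1, which adds a four-fold rotation prediction objective on top of a standard invariance loss. The following is a minimal PyTorch sketch of that idea, not the authors' released code: the module names `backbone` and `rot_head`, the loss weight `lambda_rot`, and the use of all four rotations per image are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Hedged sketch of the E-SSL rotation objective (Algorithm 1): a small head
# predicts which of the four 90-degree rotations was applied, encouraging the
# representation to be equivariant to rotations. `backbone`, `rot_head` and
# `lambda_rot` are illustrative names, not the paper's exact implementation.

def four_fold_rotations(images):
    """Return all four 90-degree rotations of a batch and their labels (0..3)."""
    rotated = torch.cat([torch.rot90(images, k, dims=(2, 3)) for k in range(4)], dim=0)
    labels = torch.arange(4, device=images.device).repeat_interleave(images.size(0))
    return rotated, labels

def essl_rotation_loss(backbone, rot_head, images):
    """Cross-entropy loss for predicting the applied four-fold rotation."""
    rotated, labels = four_fold_rotations(images)
    logits = rot_head(backbone(rotated))
    return F.cross_entropy(logits, labels)

# Hypothetical usage: the total loss combines the usual SimCLR/SimSiam objective
# with the rotation-prediction term, weighted by a coefficient lambda_rot.
# total_loss = simclr_loss + lambda_rot * essl_rotation_loss(backbone, rot_head, images)
```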
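
Similarly, the experiment-setup row can be read as a concrete CIFAR-10 training recipe. Below is a short sketch under the assumption that torchvision's ResNet-18 and transforms are used; the hyperparameter values are taken from the row above, while the variable names and the modified stem are illustrative, not the released code.

```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

# Sketch of the CIFAR-10 setup described above. Hyperparameter values come from
# the quoted experiment-setup text; module choices and names are assumptions.

# Augmentations: random resized crop to 32x32, horizontal flip, color jitter, grayscale.
augment = T.Compose([
    T.RandomResizedCrop(32, scale=(0.2, 1.0), ratio=(3 / 4, 4 / 3)),
    T.RandomHorizontalFlip(p=0.5),
    T.RandomApply([T.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    T.RandomGrayscale(p=0.2),
    T.ToTensor(),
])

# CIFAR-10 ResNet-18 variant: 3x3 stem convolution with stride 1 and padding 1,
# and no max pooling afterwards.
backbone = torchvision.models.resnet18()
backbone.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
backbone.maxpool = nn.Identity()

# SGD with momentum 0.9 and weight decay 0.0005; base learning rate 0.06 for
# E-SimCLR / E-SimSiam (0.03 for the plain baselines), cosine-decayed over
# 800 pre-training epochs after a 10-epoch linear warmup (warmup omitted here).
optimizer = torch.optim.SGD(backbone.parameters(), lr=0.06,
                            momentum=0.9, weight_decay=0.0005)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=800)
```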