Understanding the Role of Equivariance in Self-supervised Learning
Authors: Yifei Wang, Kaiwen Hu, Sharut Gupta, Ziyu Ye, Yisen Wang, Stefanie Jegelka
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper is theory-oriented but validated experimentally: "Since this work is theory-oriented to fill the gap between practice and theory by investigating how E-SSL works, we do not explore extensively for a better E-SSL design." Also: "Figure 1 reveals big differences between different choices of transformations: with linear probing, four-fold rotation and vertical flip perform the best and attain more than 60% accuracy, while the others do not even attain significant gains over random initialization (34%)." |
| Researcher Affiliation | Academia | Yifei Wang (MIT), Kaiwen Hu* (Peking University), Sharut Gupta (MIT), Ziyu Ye (The University of Chicago), Yisen Wang (Peking University), Stefanie Jegelka (TUM and MIT) |
| Pseudocode | No | The paper describes methods and theoretical analysis in text and mathematical formulas, but does not include any specific pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/kaotty/Understanding-ESSL. |
| Open Datasets | Yes | "We study seven common transformations for E-SSL on CIFAR-10 [45] with ResNet-18" (Reference 45: Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.). Also: "train the model for 200 epochs on CIFAR-10 and CIFAR-100 respectively with batch size 512 and weight decay 10^-6" and "train the model for 100 epochs on Tiny-ImageNet-200". |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly specify a validation dataset split (e.g., percentages or sample counts) for reproducibility. |
| Hardware Specification | Yes | All experiments are conducted with a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using ResNet-18 and ResNet-50 as backbones but does not provide specific software dependency versions (e.g., Python, PyTorch, TensorFlow versions or other libraries with version numbers). |
| Experiment Setup | Yes | "Under each transformation, we train the model for 200 epochs on CIFAR-10, with batch size 512 and weight decay 10^-6." Also: "We choose λ1 = 0.5 and λ2 = 9." |
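
The hyperparameters quoted in the table translate into a short training configuration. Below is a minimal, hypothetical sketch in PyTorch, assuming standard torchvision components; the optimizer, learning rate, and the exact loss terms weighted by λ1 and λ2 are not specified in the excerpts above and are marked as assumptions in the comments.

```python
# Sketch of the reported setup: ResNet-18 on CIFAR-10, 200 epochs,
# batch size 512, weight decay 1e-6 (all quoted in the table above).
import torch
import torchvision
from torch.utils.data import DataLoader
from torchvision import transforms

EPOCHS = 200
BATCH_SIZE = 512
WEIGHT_DECAY = 1e-6
# Loss-term weights quoted from the paper; the losses they weight are
# defined in the paper itself and are not reproduced here.
LAMBDA_1, LAMBDA_2 = 0.5, 9.0

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)
loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

model = torchvision.models.resnet18(num_classes=10)

# ASSUMPTION: the excerpts do not state the optimizer or learning rate;
# SGD with lr=0.1 is a common placeholder, not the authors' choice.
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, weight_decay=WEIGHT_DECAY
)
```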