Understanding the Role of Equivariance in Self-supervised Learning

Authors: Yifei Wang, Kaiwen Hu, Sharut Gupta, Ziyu Ye, Yisen Wang, Stefanie Jegelka

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Since this work is theory-oriented to fill the gap between practice and theory by investigating how E-SSL works, we do not explore extensively for a better E-SSL design." and "Figure 1 reveals big differences between different choices of transformations: with linear probing, four-fold rotation and vertical flip perform the best and attain more than 60% accuracy, while the others do not even attain significant gains over random initialization (34%)." (A sketch of the four-fold rotation pretext task follows the table.)
Researcher Affiliation | Academia | Yifei Wang (MIT), Kaiwen Hu (Peking University), Sharut Gupta (MIT), Ziyu Ye (The University of Chicago), Yisen Wang (Peking University), Stefanie Jegelka (TUM and MIT)
Pseudocode | No | The paper describes its methods and theoretical analysis in text and mathematical formulas, but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/kaotty/Understanding-ESSL.
Open Datasets | Yes | "We study seven common transformations for E-SSL on CIFAR-10 [45] with ResNet-18" (reference 45 is Alex Krizhevsky, Geoffrey Hinton, et al., "Learning multiple layers of features from tiny images," technical report, Citeseer, 2009). Also: "train the model for 200 epochs on CIFAR-10 and CIFAR-100 respectively with batch size 512 and weight decay 10^-6" and "train the model for 100 epochs on Tiny-ImageNet-200".
Dataset Splits | No | The paper mentions training and testing but does not explicitly specify a validation split (e.g., percentages or sample counts) for reproducibility.
Hardware Specification | Yes | "All experiments are conducted with a single NVIDIA RTX 3090 GPU."
Software Dependencies | No | The paper mentions ResNet-18 and ResNet-50 backbones but does not list software dependencies with version numbers (e.g., Python, PyTorch, or other libraries).
Experiment Setup | Yes | "Under each transformation, we train the model for 200 epochs on CIFAR-10, with batch size 512 and weight decay 10^-6." and "We choose λ1 = 0.5 and λ2 = 9." (A configuration sketch based on these reported hyperparameters follows the table.)
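
To make the four-fold rotation transformation discussed under Research Type concrete, here is a minimal PyTorch sketch of a rotation-prediction pretext task in the RotNet style. This is an illustration, not the authors' implementation (their code is at https://github.com/kaotty/Understanding-ESSL); the ResNet-18 backbone with a four-way classifier head is an assumption for the sketch.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def four_fold_rotations(x):
    """Rotate a batch by 0/90/180/270 degrees and return rotation labels."""
    views = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]  # rotate NCHW spatial dims
    labels = torch.arange(4).repeat_interleave(x.size(0))       # one label per rotated view
    return torch.cat(views, dim=0), labels

# Hypothetical setup: ResNet-18 with a 4-way head predicting the applied rotation.
model = resnet18(num_classes=4)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 3, 32, 32)            # dummy CIFAR-10-sized batch
views, labels = four_fold_rotations(x)
loss = criterion(model(views), labels)   # equivariant pretext loss
loss.backward()
```

Linear probing, as quoted in the table, would then train a linear classifier on the frozen backbone's features to measure the reported downstream accuracy.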
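
For the Experiment Setup row, the following sketch wires the quoted hyperparameters into a training loop. Only the quoted numbers (200 epochs, batch size 512, weight decay 10^-6, λ1 = 0.5, λ2 = 9) come from the paper; the optimizer, learning rate, and the loss terms that λ1 and λ2 weight are placeholder assumptions.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import resnet18

# Quoted hyperparameters from the paper; everything else below is assumed.
EPOCHS, BATCH_SIZE, WEIGHT_DECAY = 200, 512, 1e-6
lambda1, lambda2 = 0.5, 9.0  # loss weights quoted from the paper

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

model = resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,  # lr is an assumption
                            weight_decay=WEIGHT_DECAY)

for epoch in range(EPOCHS):
    for x, _ in loader:
        ...  # total_loss = ssl_loss + lambda1 * term_1 + lambda2 * term_2 (terms unspecified here)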