Reverse Engineering Self-Supervised Learning
Authors: Ido Ben-Shaul, Ravid Shwartz-Ziv, Tomer Galanti, Shai Dekel, Yann LeCun
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper presents an in-depth empirical analysis of SSL-trained representations, encompassing diverse models, architectures, and hyperparameters. In this paper, we conduct a comprehensive empirical exploration of SSL-trained representations and their clustering properties concerning semantic classes. |
| Researcher Affiliation | Collaboration | Ido Ben-Shaul, Department of Applied Mathematics, Tel-Aviv University & eBay Research; Ravid Shwartz-Ziv, New York University; Tomer Galanti, Massachusetts Institute of Technology; Shai Dekel, Department of Applied Mathematics, Tel-Aviv University; Yann LeCun, New York University & Meta AI, FAIR |
| Pseudocode | No | The paper describes the loss functions and architecture details in text and mathematical formulas but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states "Our experiments were implemented in PyTorch [53], utilizing the Lightly [61] library for SSL models and PyTorch Lightning [25] for training." However, it does not state that the authors are releasing their own source code for the methodology described in the paper, nor does it provide a link to it. |
| Open Datasets | Yes | Throughout all of the experiments (in the main text) we used the CIFAR-100 [41] image classification dataset. Additional evaluations on various datasets (CIFAR-10 [41], FOOD101 [11], Aircrafts [45], Tiny ImageNet [43], and ImageNet [19]) can be found in Appendix B.2. |
| Dataset Splits | No | The paper refers to a "training dataset" and "test set" for experiments on CIFAR-100 and the other datasets, and for a custom sample-level classification dataset it details the composition of the training (50,000 samples) and test (10,000 samples) sets. A "validation set" is mentioned only in the context of linear probing; no separate validation split for the main model training is described. |
| Hardware Specification | Yes | All of our models were trained on V100 GPUs. |
| Software Dependencies | No | The paper states "Our experiments were implemented in PyTorch [53], utilizing the Lightly [61] library for SSL models and PyTorch Lightning [25] for training." However, specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | Each SSL training session is carried out for 1000 epochs with a batch size of 256, using the SGD optimizer with a learning rate of 0.002, a momentum value of 0.9, and a weight decay value of 1e-6. Additionally, we used a Cosine Annealing learning rate scheduler [44]. (A hedged code sketch of this setup follows the table.) |
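The software stack and hyperparameters quoted above are specific enough to reconstruct a runnable training skeleton. The sketch below is a minimal illustration, not the authors' (unreleased) code: the SimCLR objective, the ResNet-18 backbone, and the projection-head dimensions are assumptions made for concreteness, while PyTorch, Lightly, PyTorch Lightning, the SGD settings (lr 0.002, momentum 0.9, weight decay 1e-6), and the cosine-annealing schedule come straight from the table.

```python
import torch
import torchvision
import pytorch_lightning as pl
from lightly.loss import NTXentLoss
from lightly.models.modules import SimCLRProjectionHead


class SimCLRModel(pl.LightningModule):
    """SSL model wired with the paper's reported optimizer settings."""

    def __init__(self, max_epochs: int = 1000):
        super().__init__()
        self.max_epochs = max_epochs
        # Assumption: a ResNet-18 backbone with its classification head
        # removed; the avgpool output gives 512-d features.
        resnet = torchvision.models.resnet18()
        self.backbone = torch.nn.Sequential(*list(resnet.children())[:-1])
        # Assumption: standard SimCLR projection head (512 -> 512 -> 128).
        self.projection_head = SimCLRProjectionHead(512, 512, 128)
        self.criterion = NTXentLoss()

    def forward(self, x):
        features = self.backbone(x).flatten(start_dim=1)
        return self.projection_head(features)

    def training_step(self, batch, batch_idx):
        # Lightly collate functions yield two augmented views per image.
        (view0, view1), _, _ = batch
        loss = self.criterion(self(view0), self(view1))
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        # Values quoted in the Experiment Setup row: SGD with lr 0.002,
        # momentum 0.9, weight decay 1e-6, and cosine annealing.
        optimizer = torch.optim.SGD(
            self.parameters(), lr=0.002, momentum=0.9, weight_decay=1e-6
        )
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=self.max_epochs
        )
        return [optimizer], [scheduler]
```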
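Driving the sketch above for the reported 1000 epochs at batch size 256 then reduces to a short PyTorch Lightning loop. `train_dataset` is a hypothetical placeholder for a two-view SSL dataset (e.g. built with Lightly's transforms over CIFAR-100); the paper does not publish a dataloading recipe.

```python
import torch
import pytorch_lightning as pl

# `train_dataset` is a placeholder for a dataset emitting two augmented
# views per image; it is an assumption, not specified by the paper.
dataloader = torch.utils.data.DataLoader(
    train_dataset, batch_size=256, shuffle=True, drop_last=True, num_workers=8
)

model = SimCLRModel(max_epochs=1000)
# The Hardware row reports V100 GPUs; any single CUDA device works here.
trainer = pl.Trainer(max_epochs=1000, accelerator="gpu", devices=1)
trainer.fit(model, dataloader)
```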