Improving Self-Supervised Learning by Characterizing Idealized Representations

Authors: Yann Dubois, Stefano Ermon, Tatsunori B. Hashimoto, Percy S. Liang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our resulting SSL algorithms outperform baselines on standard benchmarks, including SwAV+multicrops on linear probing of ImageNet."
Researcher Affiliation | Academia | "Yann Dubois, Tatsunori Hashimoto, Stefano Ermon, Percy Liang. Stanford University. {yanndubs,thashim,ermon,pliang}@stanford.edu"
Pseudocode | Yes | "Algorithm 1 Batched DISSL"
Open Source Code | Yes | "all the code to reproduce our results can be found at github.com/YannDubs/Invariant-Self-Supervised-Learning."
Open Datasets | Yes | "For the first experiments, we use TinyImageNet [56] [...] Our models outperform baselines on ImageNet. All models use ResNet50, 100 epochs, 2560 batch size. [...] TinyImageNet [56], ImageNet [59]"
Dataset Splits | Yes | "For the first experiments, we use TinyImageNet [56] [...] TinyImageNet [56], ImageNet [59]"
Hardware Specification | No | The paper mentions general model architectures such as "ResNet18s" and "ResNet50" but does not specify the exact GPU/CPU models, memory, or cloud provider used for the experiments. The ethics checklist explicitly answers "[No]" to "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?"
Software Dependencies | No | The paper mentions using "VISSL’s codebase [60]" but does not provide version numbers for software dependencies such as PyTorch, TensorFlow, or CUDA.
Experiment Setup | Yes | "For the first experiments, we use TinyImageNet [56], 300 pretraining epochs, and ResNet18s. [...] We use standard TinyImageNet augmentations [34] (color jittering, grayscaling, cropping) with a parameter controlling the probability and strength of augmentations to study the effect of coarsening. [...] To test this relation, we trained 80 CISSL models with various hyperparameters, while fixing augmentations and negatives k. [...] Our models outperform baselines on ImageNet. All models use ResNet50, 100 epochs, 2560 batch size."
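The Experiment Setup row describes a single parameter that jointly controls the probability and strength of the augmentations (color jittering, grayscaling, cropping) to study coarsening. A minimal sketch of that idea is below; the function name and all numeric constants are hypothetical illustrations, not the authors' actual configuration.

```python
def augmentation_config(s: float) -> dict:
    """Return augmentation parameters for a coarsening strength s in [0, 1].

    A single knob s scales both the application probability and the
    magnitude of each augmentation; s = 0 disables augmentation entirely.
    All coefficients below are illustrative assumptions.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("strength s must lie in [0, 1]")
    return {
        # Jitter applied with probability 0.8*s and magnitude 0.4*s per channel.
        "color_jitter": {
            "prob": 0.8 * s,
            "brightness": 0.4 * s,
            "contrast": 0.4 * s,
            "saturation": 0.4 * s,
        },
        # Random grayscaling with probability 0.2*s.
        "grayscale": {"prob": 0.2 * s},
        # Crops become more aggressive (smaller minimum area) as s grows.
        "random_crop": {"min_scale": 1.0 - 0.92 * s},
    }


cfg = augmentation_config(0.5)
```

Sweeping s from 0 to 1 would then produce the family of increasingly coarse augmentation settings the paper uses to study their effect.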