Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Improving Self-Supervised Learning by Characterizing Idealized Representations
Authors: Yann Dubois, Stefano Ermon, Tatsunori B. Hashimoto, Percy S. Liang
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our resulting SSL algorithms outperform baselines on standard benchmarks, including SwAV+multicrops on linear probing of ImageNet. |
| Researcher Affiliation | Academia | Yann Dubois, Tatsunori Hashimoto, Stefano Ermon, Percy Liang Stanford University EMAIL |
| Pseudocode | Yes | Algorithm 1 Batched DISSL |
| Open Source Code | Yes | all the code to reproduce our results can be found at github.com/YannDubs/Invariant-Self-Supervised-Learning. |
| Open Datasets | Yes | For the first experiments, we use TinyImageNet [56] [...] Our models outperform baselines on ImageNet. All models use ResNet50, 100 epochs, 2560 batch size. [...] TinyImageNet [56], ImageNet [59] |
| Dataset Splits | Yes | For the first experiments, we use TinyImageNet [56] [...] TinyImageNet [56], ImageNet [59] |
| Hardware Specification | No | The paper mentions general model architectures like "ResNet18s" and "ResNet50" but does not specify the exact GPU/CPU models, memory, or cloud provider used for experiments. The ethics checklist explicitly states "[No]" for "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?" |
| Software Dependencies | No | The paper mentions using "VISSL’s codebase [60]" but does not provide specific version numbers for software libraries such as PyTorch, TensorFlow, CUDA, or other dependencies. |
| Experiment Setup | Yes | For the first experiments, we use TinyImageNet [56], 300 pretraining epochs, and ResNet18s. [...] We use standard TinyImageNet augmentations [34] (color jittering, grayscaling, cropping) with a parameter controlling the probability and strength of augmentations to study the effect of coarsening. [...] To test this relation, we trained 80 CISSL models with various hyperparameters, while fixing augmentations and negatives k. [...] Our models outperform baselines on ImageNet. All models use ResNet50, 100 epochs, 2560 batch size. |