Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Representation Learning Dynamics of Self-Supervised Models

Authors: Pascal Esser, Satyaki Mukherjee, Debarghya Ghoshdastidar

TMLR 2024

Reproducibility assessment (variable, classification result, and LLM response for each item):

Research Type: Experimental
LLM Response: We numerically illustrate the validity of our theoretical findings, and discuss how the presented results provide a framework for further theoretical analysis of contrastive and non-contrastive SSL. We numerically show, on the MNIST dataset, that our derived SSL learning dynamics can be solved significantly faster than training nonlinear networks, and yet provide comparable accuracy on downstream tasks. For evaluation we use the following experimental setup: We train a network with contrastive loss as defined in (1) using gradient descent with learning rate 0.01 for 100 epochs and hidden layer size from 10 to 1000. We consider the following three activation functions: (1) sigmoid, (2) ReLU (ϕ(x) = max{x, 0}) and (3) tanh. The results are shown in Figure 2, where the plot shows the average over 10 initializations.

Researcher Affiliation: Academia
LLM Response: Pascal M. Esser (EMAIL), Technical University of Munich, Germany; Satyaki Mukherjee (EMAIL), National University of Singapore, Singapore; Debarghya Ghoshdastidar (EMAIL), Technical University of Munich, Germany

Pseudocode: No
LLM Response: The paper contains mathematical equations and differential equations to describe the learning dynamics, but no structured pseudocode or algorithm blocks are provided.

Open Source Code: No
LLM Response: No explicit statement regarding the release of source code or a link to a code repository is provided in the paper.

Open Datasets: Yes
LLM Response: For this illustration we now consider two classes with 200 datapoints each from the MNIST dataset (Deng, 2012).

Dataset Splits: No
LLM Response: The paper mentions using "two classes with 200 datapoints each from the MNIST dataset", but does not specify any training, validation, or test splits, nor does it refer to standard splits for this subset of the data.

Hardware Specification: No
LLM Response: The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. It only describes the experimental setup in terms of training parameters and datasets.

Software Dependencies: No
LLM Response: The paper describes the experimental setup and methodology but does not list any specific software libraries, frameworks, or their version numbers that were used for implementation.

Experiment Setup: Yes
LLM Response: For evaluation we use the following experimental setup: We train a network with contrastive loss as defined in (1) using gradient descent with learning rate 0.01 for 100 epochs and hidden layer size from 10 to 1000. We consider the following three activation functions: (1) sigmoid, (2) ReLU (ϕ(x) = max{x, 0}) and (3) tanh.
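The reported setup (contrastive loss, gradient descent with learning rate 0.01 for 100 epochs, hidden widths 10 to 1000, sigmoid/ReLU/tanh activations) can be sketched as below. This is a minimal illustration, not the authors' code: the paper's loss (its equation (1)) is not reproduced on this page, so a generic cosine-similarity contrastive objective stands in for it; synthetic data stands in for the two-class MNIST subset; finite-difference gradients replace backpropagation for brevity; and names such as `train_contrastive` are invented for this sketch.

```python
import numpy as np

# Activation functions from the reported setup.
ACTIVATIONS = {
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "relu": lambda x: np.maximum(x, 0.0),
    "tanh": np.tanh,
}

def embed(params, X, act):
    """Two-layer network f(X) = phi(X W1) W2."""
    W1, W2 = params
    return ACTIVATIONS[act](X @ W1) @ W2

def cosine(A, B):
    """Row-wise cosine similarity, stabilised against zero norms."""
    num = np.sum(A * B, axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1) + 1e-8
    return num / den

def contrastive_loss(params, X, X_pos, X_neg, act):
    # Stand-in objective (assumption): pull each point toward its
    # positive view, push it from a negative; the paper's eq. (1) may differ.
    Z = embed(params, X, act)
    Zp = embed(params, X_pos, act)
    Zn = embed(params, X_neg, act)
    return float(np.mean(cosine(Z, Zn)) - np.mean(cosine(Z, Zp)))

def numerical_grad(loss_fn, params, eps=1e-5):
    """Central finite differences; exact backprop omitted for brevity."""
    grads = []
    for W in params:
        g = np.zeros_like(W)
        it = np.nditer(W, flags=["multi_index"])
        for _ in it:
            i = it.multi_index
            old = W[i]
            W[i] = old + eps
            lp = loss_fn()
            W[i] = old - eps
            lm = loss_fn()
            W[i] = old  # restore before moving on
            g[i] = (lp - lm) / (2 * eps)
        grads.append(g)
    return grads

def train_contrastive(X, X_pos, X_neg, hidden=10, out_dim=2,
                      act="tanh", lr=0.01, epochs=100, seed=0):
    """Plain gradient descent with the reported lr (0.01) and epoch count (100)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    params = [rng.normal(scale=0.5, size=(d, hidden)),
              rng.normal(scale=0.5, size=(hidden, out_dim))]
    loss_fn = lambda: contrastive_loss(params, X, X_pos, X_neg, act)
    history = [loss_fn()]
    for _ in range(epochs):
        for W, g in zip(params, numerical_grad(loss_fn, params)):
            W -= lr * g
        history.append(loss_fn())
    return params, history

# Synthetic stand-in for the paper's 2-class MNIST subset.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 5))
X_pos = X + 0.05 * rng.normal(size=X.shape)   # positive views: small perturbation
X_neg = X[rng.permutation(len(X))]            # negatives: shuffled rows
params, history = train_contrastive(X, X_pos, X_neg)
```

The paper averages results over 10 initializations; that would correspond here to looping `train_contrastive` over `seed=0..9`, and the hidden-width sweep to varying `hidden` from 10 to 1000.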