Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

A Unifying Framework for Representation Learning

Authors: Shaden Alshammari, John Hershey, Axel Feldmann, William Freeman, Mark Hamilton

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We not only present a wide array of proofs, connecting over 23 different approaches, but we also leverage these theoretical results to create state-of-the-art unsupervised image classifiers that achieve a +8% improvement over the prior state-of-the-art on unsupervised classification on ImageNet-1K. ... We evaluate the I-Con framework using the ImageNet-1K dataset (Deng et al., 2009), which consists of 1,000 classes and over one million high-resolution images. This dataset is considered one of the most challenging benchmarks for unsupervised image classification due to its scale and complexity. To ensure a fair comparison with prior works, we strictly adhere to the experimental protocol introduced by (Adaloglou et al., 2023). The primary metric for evaluating clustering performance is Hungarian accuracy...
Researcher Affiliation	Collaboration	Shaden Alshammari¹, John Hershey², Axel Feldmann¹, William T. Freeman¹˒², Mark Hamilton¹˒³; ¹MIT, ²Google, ³Microsoft
Pseudocode	No	Figure 3 provides "code-style configurations" for SNE, SimCLR, and K-Means, which illustrate how these methods can be expressed using the I-Con framework's components (e.g., SNE_model = ICon(...)). These are not clearly labeled pseudocode or algorithm blocks describing the I-Con algorithm itself, but rather configuration examples.
Open Source Code	Yes	https://aka.ms/i-con
Open Datasets	Yes	We evaluate the I-Con framework using the ImageNet-1K dataset (Deng et al., 2009)... We use I-Con to design a debiasing strategy that improves unsupervised ImageNet-1K accuracy by +8%, with additional gains of +3% on CIFAR-100 and +2% on STL-10 in linear probing. ... The models were trained on the CIFAR-10 dataset for 1000 epochs...
Dataset Splits	Yes	We evaluate the I-Con framework using the ImageNet-1K dataset (Deng et al., 2009)... To ensure a fair comparison with prior works, we strictly adhere to the experimental protocol introduced by (Adaloglou et al., 2023). ... The models were trained on the CIFAR-10 dataset for 1000 epochs... For evaluation, we used two methods: (1) linear probing on the 512-dimensional embeddings from the MLP's hidden layer, and (2) k-nearest neighbors (k = 3) classification based on the same embeddings for CIFAR-10 (in-distribution) and CIFAR-100 (out-of-distribution).
Hardware Specification	No	The paper mentions using DiNO pre-trained Vision Transformer (ViT) models and different sized backbones (ViT-S/14, ViT-B/14, and ViT-L/14) but does not provide any specific details about the hardware (e.g., GPU models, CPU types, or memory) used for running the experiments.
Software Dependencies	No	The paper mentions using "ADAM (Kingma & Ba, 2017)" as an optimizer. However, it does not specify version numbers for any programming languages, libraries, or other software components used in the experiments.
Experiment Setup	Yes	The training process involved optimizing a linear classifier on top of the features extracted by the DiNO models. Each model was trained for 30 epochs, using ADAM (Kingma & Ba, 2017) with a batch size of 4096 and an initial learning rate of 1e-3. We decayed the learning rate by a factor of 0.5 every 10 epochs to allow for stable convergence. We do not apply additional normalization to the feature vectors.
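
The learning-rate schedule quoted in the Experiment Setup row (initial rate 1e-3, halved every 10 epochs, 30 epochs total) can be sketched in a few lines. This is only an illustration of the stated numbers, not the authors' training code: the function name `step_decay_lr` is hypothetical, and the assumption that epochs are zero-indexed with the decay taking effect at epochs 10 and 20 is ours, since the quoted text does not say exactly when each decay is applied.

```python
def step_decay_lr(epoch, base_lr=1e-3, factor=0.5, step=10):
    """Learning rate in effect during a zero-indexed epoch.

    Assumption: the rate is multiplied by `factor` once per completed
    block of `step` epochs (i.e. at epochs 10, 20, ...).
    """
    return base_lr * factor ** (epoch // step)


# Schedule for the 30 epochs mentioned in the Experiment Setup row.
schedule = [step_decay_lr(epoch) for epoch in range(30)]

# Collect the distinct rates in the order they first appear.
distinct_rates = []
for lr in schedule:
    if lr not in distinct_rates:
        distinct_rates.append(lr)
```

Under these assumptions the run uses three rates: 1e-3 for epochs 0 to 9, 5e-4 for epochs 10 to 19, and 2.5e-4 for epochs 20 to 29.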