Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

The Low-Rank Simplicity Bias in Deep Networks

Authors: Minyoung Huh, Hossein Mobahi, Richard Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola

TMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we make a series of empirical observations that investigate and extend the hypothesis that deeper networks are inductively biased to find solutions with lower effective rank embeddings. We show empirically that our claim holds true on finite width linear and non-linear models on practical learning paradigms and show that on natural data, these are often the solutions that generalize well.
Researcher Affiliation | Collaboration | Minyoung Huh (EMAIL, MIT CSAIL); Hossein Mobahi (EMAIL, Google Research); Richard Zhang (EMAIL, Adobe Research); Brian Cheung (EMAIL, MIT CSAIL & BCS); Pulkit Agrawal (EMAIL, MIT CSAIL); Phillip Isola (EMAIL, MIT CSAIL)
Pseudocode | No | The paper describes methods and transformations, particularly in Appendix C, 'Expanding a non-linear network', but does not present any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | "The training details for ImageNet can be found in https://github.com/pytorch/examples/blob/master/imagenet." This link refers to third-party standard examples, not the authors' own code for the methodology described in this paper; no explicit statement of a code release by the authors is found.
Open Datasets | Yes | We leverage our observations to demonstrate "linear over-parameterization by depth" can be used to achieve better generalization performance on CIFAR (Krizhevsky et al., 2009) and ImageNet (Russakovsky et al., 2015) without increasing modeling capacity. ... The kernel is constructed from the MNIST dataset.
Dataset Splits | Yes | We scale up our experiments to ImageNet, a large-scale dataset consisting of 1.3 million images with 1000 classes, and show that our findings hold in practical settings. For these experiments, we use standardized architectures: AlexNet (Krizhevsky et al., 2012), which consists of 8 layers, and ResNet10 / ResNet18 (He et al., 2016), which consist of 10 and 18 layers, respectively. ... For all experiments rank(W) = {1, 4, 16, 32, 64}, we use a total of 128 training samples.
Hardware Specification | Yes | All models for image classification are trained using PyTorch (Paszke et al., 2019) with RTX 2080Ti GPUs.
Software Dependencies | Yes | All models for image classification are trained using PyTorch (Paszke et al., 2019) with RTX 2080Ti GPUs.
Experiment Setup | Yes | We train the model using SGD with a momentum of 0.9, and we do not use weight decay. ... For each model we trained using the learning rates [1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001] ... All models are trained for 24000 epochs ... For all models, we step the learning rate by a factor of 10 at epoch 18000. ... For SGD, we used a mini-batch size of 32. ... For data augmentation, we apply a random horizontal flip and random-resized crop.
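The learning-rate schedule quoted in the Experiment Setup row can be sketched as a small helper. This is a minimal sketch of one plausible reading of that schedule (a single drop by a factor of 10 at epoch 18000 over a 24000-epoch run); the function and constant names are my own, not from the paper.

```python
def lr_at_epoch(base_lr: float, epoch: int,
                step_epoch: int = 18000, factor: float = 0.1) -> float:
    """Step schedule matching the quoted setup (assumed interpretation):
    hold base_lr constant, then multiply by `factor` from `step_epoch` on."""
    return base_lr * factor if epoch >= step_epoch else base_lr


# The paper sweeps these base learning rates per model (quoted above).
LR_SWEEP = [1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001]
```

For example, with base_lr=0.1 the rate is 0.1 for epochs 0 through 17999 and 0.01 for the remaining epochs up to 24000.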