Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Rethinking Benign Overfitting in Two-Layer Neural Networks
Authors: Ruichen Xu, Kexin Chen
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental validation on both synthetic and real-world datasets supports our theoretical results. In this section, we first validate our theory in Theorem 3.4 by constructing datasets and models following our problem setup in Section 2. We further verify our conclusions with real-world datasets MNIST (LeCun et al., 1998), CIFAR-10, and CIFAR-100 (Krizhevsky et al., 2009). |
| Researcher Affiliation | Academia | 1Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong, China 2School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China. |
| Pseudocode | No | The paper describes methods and results in paragraph form and mathematical derivations, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific links to source code repositories, nor does it explicitly state that code will be released or is available in supplementary materials. |
| Open Datasets | Yes | We further verify our conclusions with real-world datasets MNIST (LeCun et al., 1998), CIFAR-10, and CIFAR-100 (Krizhevsky et al., 2009). |
| Dataset Splits | No | The paper mentions using synthetic and real-world datasets (MNIST, CIFAR-10, CIFAR-100) and refers to 'training dataset' and 'test data' implicitly, but it does not explicitly provide details on how these datasets were split into training, validation, or test sets (e.g., specific percentages, sample counts, or methodology for creating splits). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions using 'PyTorch' for initialization in the synthetic dataset experiments, but it does not provide any specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We train a two-layer neural network as defined in Section 2 with m = 100 neurons. We use the default initialization method in PyTorch to initialize the neuron parameters. We train the neural networks using GD with a learning rate η = 0.05 over 20 epochs. |
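The setup quoted in the Experiment Setup row (a two-layer network with m = 100 neurons, PyTorch-style uniform initialization, full-batch GD with η = 0.05 for 20 epochs) can be sketched as follows. This is a minimal NumPy illustration, not the paper's code: the synthetic data distribution, the logistic loss, and the ReLU activation are assumptions filled in for the sake of a runnable example, since the report does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic binary-classification data; the report does not
# state the data distribution used in the paper's synthetic experiments.
n, d, m = 200, 20, 100            # samples, input dim, neurons (m = 100 per the report)
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0])              # labels from the first coordinate (illustrative only)

# Two-layer ReLU network f(x) = sum_j a_j * relu(w_j . x), initialized
# roughly like PyTorch's default for linear layers: uniform in +-1/sqrt(fan_in).
W = rng.uniform(-1.0, 1.0, (m, d)) / np.sqrt(d)
a = rng.uniform(-1.0, 1.0, m) / np.sqrt(m)

def forward(X, W, a):
    H = np.maximum(X @ W.T, 0.0)  # hidden activations, shape (n, m)
    return H @ a, H

def logistic_loss(f, y):
    return np.mean(np.log1p(np.exp(-y * f)))

loss0 = logistic_loss(forward(X, W, a)[0], y)

eta, epochs = 0.05, 20            # learning rate and epoch count per the report
for _ in range(epochs):
    f, H = forward(X, W, a)
    g = -y / (1.0 + np.exp(y * f)) / n            # dL/df, shape (n,)
    grad_a = H.T @ g                              # gradient w.r.t. output weights
    mask = (H > 0).astype(float)                  # ReLU derivative
    grad_W = (g[:, None] * a[None, :] * mask).T @ X  # gradient w.r.t. hidden weights
    a -= eta * grad_a                             # full-batch GD step
    W -= eta * grad_W

loss_final = logistic_loss(forward(X, W, a)[0], y)
```

Twenty full-batch steps at this learning rate should visibly reduce the training loss on the toy data, which is all the sketch is meant to show; the paper's actual experiments additionally cover MNIST, CIFAR-10, and CIFAR-100.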