Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
A Universal Law of Robustness via Isoperimetry
Authors: Sebastien Bubeck, Mark Sellke
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose a theoretical explanation for this phenomenon. We prove that for a broad class of data distributions and model classes, overparametrization is necessary if one wants to interpolate the data smoothly. Namely we show that smooth interpolation requires d times more parameters than mere interpolation, where d is the ambient data dimension. We prove this universal law of robustness for any smoothly parametrized function class with polynomial size weights, and any covariate distribution verifying isoperimetry (or a mixture thereof). |
| Researcher Affiliation | Collaboration | Sébastien Bubeck, Microsoft Research, EMAIL; Mark Sellke, Stanford University, EMAIL |
| Pseudocode | No | The paper does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statements about releasing code or links to a code repository for the methodology described. |
| Open Datasets | Yes | To put Theorem 1 in context, we compare to the empirical results presented in [MMS+18]. In the latter work, they consider the MNIST dataset which consists of $n = 6 \times 10^4$ images in dimension $28^2 = 784$. |
| Dataset Splits | No | This paper is theoretical and focuses on proving a mathematical law. It discusses existing empirical results from other papers (e.g., [MMS+18]) but does not define or use its own dataset splits (train/validation/test) to reproduce experiments. |
| Hardware Specification | No | The paper is theoretical and does not conduct experiments, therefore no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental implementation, thus no software dependencies with version numbers are listed. |
| Experiment Setup | No | The paper is theoretical and does not describe specific experiments with hyperparameters or training configurations conducted by the authors. |
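
The "d times more parameters" statement quoted in the Research Type row can be made concrete with the MNIST figures quoted in the Open Datasets row. The sketch below is only an order-of-magnitude illustration: it assumes that mere interpolation needs on the order of n parameters, and it ignores the constants, logarithmic factors, and distributional assumptions in the paper's theorem. The function and variable names are hypothetical and do not come from the paper.

```python
# Order-of-magnitude illustration of the "d times more parameters" claim.
# Assumption (not taken from the table): mere interpolation needs roughly
# n parameters, so smooth interpolation needs roughly n * d.

n_samples = 6 * 10**4   # MNIST training images, as quoted from [MMS+18]
ambient_dim = 28**2     # pixels per image (28 x 28)


def smooth_interpolation_params(n: int, d: int) -> int:
    """Return n * d, the rough parameter scale for smooth interpolation.

    Constants and logarithmic factors are deliberately ignored.
    """
    return n * d


mere_params = n_samples
smooth_params = smooth_interpolation_params(n_samples, ambient_dim)

print(f"ambient dimension d       : {ambient_dim}")
print(f"mere interpolation  (~n)  : {mere_params:,}")
print(f"smooth interpolation (~nd): {smooth_params:,}")
```

Under these assumptions the gap between the two regimes is exactly the factor d = 784. The result should be read as a scale, not as a prediction for any specific architecture.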