Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Interpolating Classifiers Make Few Mistakes
Authors: Tengyuan Liang, Benjamin Recht
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper provides elementary analyses of the regret and generalization of minimum-norm interpolating classifiers (MNIC). The MNIC is the function of smallest Reproducing Kernel Hilbert Space norm that perfectly interpolates a label pattern on a finite data set. We derive a mistake bound for MNIC and a regularized variant that holds for all data sets. This bound follows from elementary properties of matrix inverses. Under the assumption that the data is independently and identically distributed, the mistake bound implies that MNIC generalizes at a rate proportional to the norm of the interpolating solution and inversely proportional to the number of data points. This rate matches similar rates derived for margin classifiers and perceptrons. We derive several plausible generative models where the norm of the interpolating classifier is bounded or grows at a rate sublinear in n. We also show that as long as the population class conditional distributions are sufficiently separable in total variation, then MNIC generalizes with a fast rate. Keywords: Generalization, Regret, Mistake Bound, Interpolation, Least-squares classification. |
| Researcher Affiliation | Academia | Tengyuan Liang EMAIL Booth School of Business University of Chicago Chicago, IL 60637, USA Benjamin Recht EMAIL Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720, USA |
| Pseudocode | Yes | Algorithm 1: Online Minimum-Norm Interpolation Algorithm. Data: a sequence of data {z_i}_{i=1}^n and a feature embedding φ : R^d → R^p corresponding to an RKHS. Result: a min-norm interpolation function f̂_{S_{i−1}} : X → Y on S_{i−1} for each i = 1, …, n. Initialization: set β̂_0 = 0 ∈ R^p, f̂_{S_0} = 0, and i = 0. While i < n: set i = i + 1 and receive a new datum z_i = (x_i, y_i); predict with the previous interpolator f̂_{S_{i−1}} and record the error ε_i = y_i − f̂_{S_{i−1}}(x_i); update the vector β̂_i = β̂_{i−1} + (ε_i / ‖Π⊥_{i−1} φ_{x_i}‖²) Π⊥_{i−1} φ_{x_i}, where Π⊥_{i−1} denotes projection onto the orthogonal complement of span{φ_{x_1}, …, φ_{x_{i−1}}}; and set the corresponding min-norm interpolation function f̂_{S_i}(x) = ⟨φ_x, β̂_i⟩. |
| Open Source Code | No | The paper does not explicitly state that the source code for the methodology is being released, nor does it provide a link to a code repository. The license mentioned refers to the paper itself, not accompanying code. |
| Open Datasets | No | The paper is theoretical and focuses on mathematical analysis and generative models, rather than empirical evaluation on specific datasets. Therefore, it does not provide access information for any datasets. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments on specific datasets. Consequently, there is no mention of dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical proofs and analyses. It does not mention any specific software dependencies or versions required to implement or reproduce the theoretical work. |
| Experiment Setup | No | The paper is theoretical and primarily provides mathematical analyses and proofs. It does not describe an experimental setup with specific hyperparameters, training configurations, or system-level settings. |
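The pseudocode extracted in the table above can be rendered as a short NumPy sketch. This is not the authors' code (the paper releases none); the function name, the identity default embedding, and the Gram–Schmidt bookkeeping for the orthogonal projection Π⊥ are illustrative choices, assuming each update direction is the component of the new feature orthogonal to the span of the features seen so far.

```python
import numpy as np

def online_min_norm_interpolation(X, y, phi=lambda x: x):
    """Sketch of the online minimum-norm interpolation update.

    After processing point i, beta satisfies <phi(x_j), beta> = y_j for
    all j <= i, and beta lies in the span of the observed features, so it
    is the minimum-norm interpolator of the data seen so far.
    """
    feats = [np.asarray(phi(x), dtype=float) for x in X]
    beta = np.zeros_like(feats[0])
    basis = []      # orthonormal basis of span{phi(x_1), ..., phi(x_{i-1})}
    mistakes = 0
    for f, target in zip(feats, y):
        pred = f @ beta                 # predict with the previous interpolator
        eps = target - pred             # record the error
        if np.sign(pred) != np.sign(target):
            mistakes += 1
        # residual of f after projecting onto the current span: Π⊥ φ_{x_i}
        r = f.copy()
        for b in basis:
            r -= (b @ f) * b
        if np.linalg.norm(r) > 1e-12:   # skip if f is already in the span
            beta = beta + (eps / (r @ r)) * r
            basis.append(r / np.linalg.norm(r))
    return beta, mistakes
```

Because the update direction r is orthogonal to all previously seen features, earlier interpolation constraints are preserved, while f @ r = ‖r‖² makes the new constraint exact in one step.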