Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Interpolating Classifiers Make Few Mistakes
Authors: Tengyuan Liang, Benjamin Recht
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper provides elementary analyses of the regret and generalization of minimum-norm interpolating classifiers (MNIC). The MNIC is the function of smallest Reproducing Kernel Hilbert Space norm that perfectly interpolates a label pattern on a finite data set. We derive a mistake bound for MNIC and a regularized variant that holds for all data sets. This bound follows from elementary properties of matrix inverses. Under the assumption that the data is independently and identically distributed, the mistake bound implies that MNIC generalizes at a rate proportional to the norm of the interpolating solution and inversely proportional to the number of data points. This rate matches similar rates derived for margin classifiers and perceptrons. We derive several plausible generative models where the norm of the interpolating classifier is bounded or grows at a rate sublinear in n. We also show that as long as the population class conditional distributions are sufficiently separable in total variation, then MNIC generalizes with a fast rate. Keywords: Generalization, Regret, Mistake Bound, Interpolation, Least-squares classification. |
| Researcher Affiliation | Academia | Tengyuan Liang EMAIL Booth School of Business University of Chicago Chicago, IL 60637, USA Benjamin Recht EMAIL Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720, USA |
| Pseudocode | Yes | Algorithm 1: Online Minimum-Norm Interpolation Algorithm. Data: a sequence of data {z_i}_{i=1}^n and a feature embedding φ : R^d → R^p corresponding to an RKHS. Result: a min-norm interpolation function f̂_{S_{i−1}} : X → Y on S_{i−1} for each i = 1, …, n. Initialization: set β̂_0 = 0 ∈ R^p, f̂_{S_0} = 0, and i = 0. While i < n: set i = i + 1 and receive a new datum z_i = (x_i, y_i); predict with the previous interpolator f̂_{S_{i−1}} and record the error ε_i = y_i − f̂_{S_{i−1}}(x_i); update the vector β̂_i = β̂_{i−1} + (ε_i / ‖Π⊥_{i−1} φ_{x_i}‖²) Π⊥_{i−1} φ_{x_i}, where Π⊥_{i−1} denotes projection onto the orthogonal complement of span{φ_{x_1}, …, φ_{x_{i−1}}}; and set the corresponding min-norm interpolation function f̂_{S_i}(x) = ⟨φ_x, β̂_i⟩. |
| Open Source Code | No | The paper does not explicitly state that the source code for the methodology is being released, nor does it provide a link to a code repository. The license mentioned refers to the paper itself, not accompanying code. |
| Open Datasets | No | The paper is theoretical and focuses on mathematical analysis and generative models, rather than empirical evaluation on specific datasets. Therefore, it does not provide access information for any datasets. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments on specific datasets. Consequently, there is no mention of dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical proofs and analyses. It does not mention any specific software dependencies or versions required to implement or reproduce the theoretical work. |
| Experiment Setup | No | The paper is theoretical and primarily provides mathematical analyses and proofs. It does not describe an experimental setup with specific hyperparameters, training configurations, or system-level settings. |
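The pseudocode extracted in the table above can be rendered as a short NumPy sketch. This is not the authors' code (the paper releases none); the function name, the identity default embedding, and the Gram–Schmidt bookkeeping for the orthogonal projection Π⊥ are illustrative choices, assuming each update direction is the component of the new feature orthogonal to the span of the features seen so far.

```python
import numpy as np

def online_min_norm_interpolation(X, y, phi=lambda x: x):
    """Sketch of the online minimum-norm interpolation update.

    After processing point i, beta satisfies <phi(x_j), beta> = y_j for
    all j <= i, and beta lies in the span of the observed features, so it
    is the minimum-norm interpolator of the data seen so far.
    """
    feats = [np.asarray(phi(x), dtype=float) for x in X]
    beta = np.zeros_like(feats[0])
    basis = []      # orthonormal basis of span{phi(x_1), ..., phi(x_{i-1})}
    mistakes = 0
    for f, target in zip(feats, y):
        pred = f @ beta                 # predict with the previous interpolator
        eps = target - pred             # record the error
        if np.sign(pred) != np.sign(target):
            mistakes += 1
        # residual of f after projecting onto the current span: Π⊥ φ_{x_i}
        r = f.copy()
        for b in basis:
            r -= (b @ f) * b
        if np.linalg.norm(r) > 1e-12:   # skip if f is already in the span
            beta = beta + (eps / (r @ r)) * r
            basis.append(r / np.linalg.norm(r))
    return beta, mistakes
```

Because the update direction r is orthogonal to all previously seen features, earlier interpolation constraints are preserved, while f @ r = ‖r‖² makes the new constraint exact in one step.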