Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Failures of Model-dependent Generalization Bounds for Least-norm Interpolation
Authors: Peter L. Bartlett, Philip M. Long
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We consider bounds on the generalization performance of the least-norm linear regressor, in the over-parameterized regime where it can interpolate the data. We describe a sense in which any generalization bound of a type that is commonly proved in statistical learning theory must sometimes be very loose when applied to analyze the least-norm interpolant. In particular, for a variety of natural joint distributions on training examples, any valid generalization bound that depends only on the output of the learning algorithm, the number of training examples, and the confidence parameter, and that satisfies a mild condition (substantially weaker than monotonicity in sample size), must sometimes be very loose: it can be bounded below by a constant when the true excess risk goes to zero. Keywords: generalization bounds, benign overfitting, linear regression, statistical learning theory, lower bounds |
| Researcher Affiliation | Collaboration | Peter L. Bartlett EMAIL University of California, Berkeley & Google, 367 Evans Hall #3860 Berkeley, CA 94720-3860. Philip M. Long EMAIL Google, 1600 Amphitheatre Parkway, Mountain View, CA, 94043. |
| Pseudocode | No | The paper defines concepts and provides lemmas and proofs, but does not include any structured pseudocode or algorithm blocks. For example, Definition 6 describes Pn in numbered steps, but this is a definition, not an algorithm. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to code repositories or mention code in supplementary materials. |
| Open Datasets | No | The paper discusses theoretical probability distributions and data generation processes (e.g., "joint distribution Dn on (x, y)-pairs is defined as follows. Let s = n, N = s², d = N². Let θ be an arbitrary unit-length vector. Let Σs be an arbitrary covariance matrix with eigenvalues λ1 = 1/81, λ2 = ⋯ = λd = 1/d². The marginal of Dn on x is then N(0, Σs).") rather than using specific, publicly available empirical datasets. No links, DOIs, or citations to external datasets are provided. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments on empirical datasets; therefore, there is no mention of dataset splits (e.g., training, validation, test splits). |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup that would require hardware. Therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper focuses on theoretical analysis and proofs, and as such, it does not mention any specific software dependencies or versions required to replicate experiments. |
| Experiment Setup | No | The paper presents a theoretical analysis and proofs, and does not involve empirical experiments. Therefore, there are no details provided regarding experimental setup, hyperparameters, or training configurations. |
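Although the paper itself contains no code, the object it studies is concrete. As a minimal illustrative sketch (not from the paper), the least-norm interpolant in the over-parameterized regime (d > n) is the minimum-Euclidean-norm solution of Xw = y, computable via the Moore-Penrose pseudoinverse; the variable names and dimensions below are arbitrary choices for illustration:

```python
# Sketch of the least-norm interpolant w = X^+ y in the over-parameterized
# regime (d > n), where infinitely many w satisfy X w = y exactly.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                     # more features than samples
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

X_pinv = np.linalg.pinv(X)
w = X_pinv @ y                     # least-norm interpolant

# w interpolates the training data exactly (X has full row rank a.s.) ...
assert np.allclose(X @ w, y)

# ... and has the smallest norm among all interpolating solutions:
# adding any null-space component preserves interpolation but grows the norm.
null_component = (np.eye(d) - X_pinv @ X) @ rng.standard_normal(d)
w_alt = w + null_component
assert np.allclose(X @ w_alt, y)
assert np.linalg.norm(w) <= np.linalg.norm(w_alt) + 1e-9
```

The paper's lower bounds concern generalization guarantees for exactly this estimator, so no hyperparameters or training configuration arise: the solution is given in closed form.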