Provable Tempered Overfitting of Minimal Nets and Typical Nets

Authors: Itamar Harel, William Hoza, Gal Vardi, Itay Evron, Nati Srebro, Daniel Soudry

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | For both learning rules, we prove overfitting is tempered. Our analysis rests on a new bound on the size of a threshold circuit consistent with a partial function. To the best of our knowledge, ours are the first theoretical results on benign or tempered overfitting that: (1) apply to deep NNs, and (2) do not require a very high or very low input dimension. |
| Researcher Affiliation | Collaboration | Itamar Harel (Technion, itamarharel01@gmail.com); William M. Hoza (The University of Chicago); Gal Vardi (Weizmann Institute of Science); Itay Evron (Technion); Nathan Srebro (Toyota Technological Institute at Chicago); Daniel Soudry (Technion) |
| Pseudocode | No | The paper describes conceptual frameworks (e.g., 'Framework 1 Learning interpolators') and illustrates constructions with diagrams (e.g., Figure 2). However, it does not contain any formal pseudocode or algorithm blocks with explicit labels like 'Algorithm' or 'Pseudocode'. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing open-source code for the methodology or provide a link to a code repository. The NeurIPS checklist also indicates 'No data or models are released.' |
| Open Datasets | No | The paper defines a theoretical 'Data distribution' model (Section 2.2) and a 'training set S' sampled from it for its theoretical analysis. It does not use or provide concrete access information (link, DOI, citation) for a specific, publicly available dataset for empirical training. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments. Consequently, no information on training, validation, or test dataset splits is provided. |
| Hardware Specification | No | The paper is purely theoretical and does not describe any empirical experiments. Therefore, no hardware specifications, such as specific GPU or CPU models, are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any empirical experiments. As such, it does not list any software dependencies with specific version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe empirical experiments. Thus, no specific experimental setup details, such as hyperparameters or training configurations, are provided. |