Benign, Tempered, or Catastrophic: Toward a Refined Taxonomy of Overfitting
Authors: Neil Mallinar, James B. Simon, Amirhesam Abedsoltan, Parthe Pandit, Mikhail Belkin, Preetum Nakkiran
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then empirically study deep neural networks through the lens of our taxonomy, and find that those trained to interpolation are tempered, while those stopped early are benign. We hope our work leads to a more refined understanding of overfitting in modern learning. In Section 4, we empirically study overfitting for DNNs. We give evidence that standard DNNs trained to interpolation exhibit tempered overfitting, not benign overfitting, motivating the further study of tempered overfitting in the pursuit of understanding modern machine learning methods. |
| Researcher Affiliation | Collaboration | Neil Mallinar (UC San Diego, nmallina@ucsd.edu); James B. Simon (UC Berkeley, james.simon@berkeley.edu); Amirhesam Abedsoltan (UC San Diego, aabedsoltan@ucsd.edu); Parthe Pandit (UC San Diego, parthepandit@ucsd.edu); Mikhail Belkin (UC San Diego, mbelkin@ucsd.edu); Preetum Nakkiran (Apple & UC San Diego, preetum@apple.com) |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | Figure 2 depicts such an experiment: a ResNet is trained on a binary variant of CIFAR-10 with varying amounts of training label noise, and with increasing sample size n. In Figure 3 we demonstrate our taxonomy experimentally for two benign methods (k-NN and early-stopped MLPs) and two tempered methods (1-NN and interpolating MLPs) on a binary classification version of MNIST, with varying noise in the train labels. Figure 9 shows noise profiles for Wide ResNets trained on a binary version of SVHN. The paper cites Krizhevsky et al. [2009] for CIFAR-10, LeCun [1998] for MNIST, and Netzer et al. [2011] for SVHN, which are all standard public datasets. (A hedged sketch of constructing such a noisy binary dataset follows this table.) |
| Dataset Splits | No | The paper mentions a 'train set' and a 'test set' but does not specify train/validation/test splits (e.g., percentages or sample counts) needed for reproduction; it refers to a 'clean test set' but gives no explicit validation split. |
| Hardware Specification | Yes | This work used the Extreme Science and Engineering Discovery Environment (XSEDE) [Towns et al., 2014], which is supported by NSF grant number ACI-1548562, Expanse CPU/GPU compute nodes, and allocations TG-CIS210104 and TG-CIS220009. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with specific versions like PyTorch 1.9 or TensorFlow 2.x) needed to replicate the experiments. |
| Experiment Setup | Yes | We provide full experimental details in Appendix C. Appendix C.2 Training Details: All neural networks in this paper are trained using the Adam optimizer [Kingma and Ba, 2015] with a batch size of 500, a learning rate of 0.001, and a weight decay of 0.0001. All networks are trained for 10,000 epochs or until training loss reached 10^-5. (A hedged sketch of this training configuration follows this table.) |
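
The Open Datasets row references binary variants of CIFAR-10, MNIST, and SVHN with varying train-label noise. Since the paper releases no code, the following is a minimal sketch of one plausible construction, assuming torchvision is available and that "binary" means grouping the ten classes into two super-classes; the 5-vs-5 grouping and the `add_label_noise` helper are illustrative assumptions, not the authors' released procedure.

```python
# Minimal sketch (not the authors' code): build a binary CIFAR-10 with
# symmetric label noise on the training set. The 5-vs-5 class grouping and
# the noise-injection helper are illustrative assumptions.
import numpy as np
from torchvision import datasets, transforms

def binarize(targets):
    # Assumption: classes 0-4 -> label 0, classes 5-9 -> label 1.
    t = np.asarray(targets)
    return (t >= 5).astype(np.int64)

def add_label_noise(labels, p, rng):
    # Flip each binary train label independently with probability p.
    labels = labels.copy()
    flip = rng.random(len(labels)) < p
    labels[flip] = 1 - labels[flip]
    return labels

def make_binary_cifar10(noise_p=0.2, seed=0, root="./data"):
    rng = np.random.default_rng(seed)
    tfm = transforms.ToTensor()
    train = datasets.CIFAR10(root, train=True, download=True, transform=tfm)
    test = datasets.CIFAR10(root, train=False, download=True, transform=tfm)
    train.targets = add_label_noise(binarize(train.targets), noise_p, rng).tolist()
    test.targets = binarize(test.targets).tolist()  # test labels stay clean
    return train, test

if __name__ == "__main__":
    train_set, test_set = make_binary_cifar10(noise_p=0.2)
    print(len(train_set), len(test_set))
```

Only the training labels are corrupted here, matching the paper's setup of measuring test error on a clean test set as a function of train-label noise.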
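The Experiment Setup row quotes Appendix C.2: Adam with batch size 500, learning rate 0.001, weight decay 0.0001, trained for 10,000 epochs or until the training loss reaches 10^-5. A minimal PyTorch sketch of that loop, under those quoted hyperparameters, is below; the model and loss are placeholders, since the paper's architectures (ResNets, Wide ResNets, MLPs) are specified in its appendix, not here.

```python
# Minimal sketch of the Appendix C.2 training setup (hyperparameters taken
# from the quote above); the caller supplies the model, which is not one of
# the paper's exact architectures.
import torch
from torch import nn
from torch.utils.data import DataLoader

def train_to_interpolation(model, train_set,
                           device="cuda" if torch.cuda.is_available() else "cpu",
                           max_epochs=10_000, loss_threshold=1e-5):
    model = model.to(device)
    loader = DataLoader(train_set, batch_size=500, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(max_epochs):
        total, count = 0.0, 0
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
            total += loss.item() * x.size(0)
            count += x.size(0)
        mean_loss = total / count
        if mean_loss <= loss_threshold:  # stop once mean train loss reaches 1e-5
            break
    return model
```

Paired with the dataset sketch above, passing a simple classifier (e.g., a small MLP over flattened images) and `train_set` to `train_to_interpolation` would reproduce the general shape of the interpolation experiments, though not the paper's exact architectures or results.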