Uniform convergence may be unable to explain generalization in deep learning
Authors: Vaishnavh Nagarajan, J. Zico Kolter
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | While it is well-known that many of these existing bounds are numerically large, through numerous experiments, we bring to light a more concerning aspect of these bounds: in practice, these bounds can increase with the training dataset size. |
| Researcher Affiliation | Collaboration | Vaishnavh Nagarajan, Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA (vaishnavh@cs.cmu.edu); J. Zico Kolter, Department of Computer Science, Carnegie Mellon University & Bosch Center for Artificial Intelligence, Pittsburgh, PA (zkolter@cs.cmu.edu) |
| Pseudocode | No | The paper describes its methods and concepts in prose but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks, nor structured steps formatted like code. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a direct link to a code repository for the methodology described. |
| Open Datasets | Yes | We focus on fully connected networks of depth d = 5, width h = 1024 trained on MNIST |
| Dataset Splits | No | The paper refers to training data and a test set but does not explicitly describe a validation set or give split percentages/counts. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only describes the neural network architecture and training parameters. |
| Software Dependencies | No | The paper does not list specific software dependencies with their version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific libraries with versions). |
| Experiment Setup | Yes | We use SGD with learning rate 0.1 and batch size 1 to minimize cross-entropy loss until 99% of the training data are classified correctly by a margin of at least γ = 10 (a hedged sketch of this setup follows the table) |
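Since the paper does not release code, the following is a minimal PyTorch sketch of the setup quoted in the "Open Datasets" and "Experiment Setup" rows: a depth-5, width-1024 fully connected network trained on MNIST with SGD (learning rate 0.1, batch size 1) on cross-entropy loss until 99% of the training set is classified correctly by a margin of at least γ = 10. The depth-counting convention, the margin definition, the once-per-epoch stopping check, and all function names (`make_mlp`, `margins`, `train_until_margin`) are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch (not the authors' code) of the experiment described above.
# Assumptions: depth d = 5 counted as weight layers (4 hidden ReLU layers of
# width 1024 plus an output layer), margin = correct-class logit minus the
# largest other logit, and the stopping criterion checked once per epoch.
import torch
import torch.nn as nn
from torchvision import datasets, transforms


def make_mlp(depth=5, width=1024, in_dim=28 * 28, num_classes=10):
    """Fully connected ReLU network with `depth` weight layers."""
    layers, dim = [], in_dim
    for _ in range(depth - 1):
        layers += [nn.Linear(dim, width), nn.ReLU()]
        dim = width
    layers.append(nn.Linear(dim, num_classes))
    return nn.Sequential(nn.Flatten(), *layers)


def margins(logits, targets):
    """Margin of each example: correct-class logit minus the max other logit."""
    correct = logits.gather(1, targets.unsqueeze(1)).squeeze(1)
    masked = logits.clone()
    masked.scatter_(1, targets.unsqueeze(1), float("-inf"))
    return correct - masked.max(dim=1).values


def train_until_margin(model, train_set, lr=0.1, gamma=10.0, frac=0.99):
    """SGD with batch size 1 on cross-entropy until `frac` of the training
    data is classified correctly by a margin of at least `gamma`."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    loader = torch.utils.data.DataLoader(train_set, batch_size=1, shuffle=True)
    eval_loader = torch.utils.data.DataLoader(train_set, batch_size=256)
    while True:
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        # Check the margin-based stopping criterion once per epoch (assumption).
        with torch.no_grad():
            hits = sum((margins(model(x), y) >= gamma).sum().item()
                       for x, y in eval_loader)
        if hits / len(train_set) >= frac:
            return model


if __name__ == "__main__":
    mnist = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
    train_until_margin(make_mlp(), mnist)
```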