Spectrally-normalized margin bounds for neural networks
Authors: Peter L. Bartlett, Dylan J. Foster, Matus J. Telgarsky
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper includes an empirical investigation, in Section 2, of neural network generalization on the standard datasets cifar10, cifar100, and mnist using the paper's spectrally-normalized margin bound. |
| Researcher Affiliation | Academia | Peter L. Bartlett (peter@berkeley.edu): University of California, Berkeley and Queensland University of Technology. Dylan J. Foster (djf244@cornell.edu): Cornell University. Matus J. Telgarsky (mjt@illinois.edu): University of Illinois, Urbana-Champaign. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions that the experiments were implemented in Keras and cites a Keras reference, but it does not provide a link to, or any statement about, its own source code. |
| Open Datasets | Yes | The experiments in Section 2 use the standard, publicly available cifar10, cifar100, and mnist datasets. |
| Dataset Splits | No | The paper does not explicitly provide specific training/validation/test dataset splits with percentages or sample counts, nor does it reference predefined splits with citations for reproducibility. |
| Hardware Specification | No | The acknowledgments state that M.T. and D.F. used a GPU machine provided by Karthik Sridharan and made possible by an NVIDIA GPU grant; this mentions a GPU but gives no specific model number or detailed specifications. |
| Software Dependencies | No | All experiments were implemented in Keras [Chollet et al., 2015]. The paper mentions Keras but does not specify a version number for it, which is needed for reproducibility. |
| Experiment Setup | Yes | All experiments were implemented in Keras [Chollet et al., 2015]. In order to minimize conflating effects of optimization and regularization, the optimization method was vanilla SGD with step size 0.01, and all regularization (weight decay, batch normalization, etc.) was disabled. cifar in general refers to cifar10; however, cifar100 will also be explicitly mentioned. The network architecture is essentially AlexNet [Krizhevsky et al., 2012] with all normalization/regularization removed, and with no adjustments of any kind (even to the learning rate) across the different experiments. An illustrative sketch of this setup appears below the table. |
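
To make the reported setup concrete, the following is a minimal sketch assuming the TensorFlow-bundled Keras API (the paper used Keras as of 2015 [Chollet et al., 2015]). Only vanilla SGD with step size 0.01 and the removal of all regularization (weight decay, batch normalization, dropout) come from the paper; the layer widths, preprocessing, and epoch count below are illustrative assumptions, not the authors' exact AlexNet-style configuration.

```python
# Sketch of the described training setup: vanilla SGD (step size 0.01),
# no weight decay, no batch normalization, no dropout.
# Assumes TensorFlow 2.x with its bundled Keras.
from tensorflow import keras
from tensorflow.keras import layers

# cifar10, as in the paper's main experiments; cifar100 and mnist follow
# the same pattern (mnist inputs are 28x28 grayscale, so input_shape differs).
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # preprocessing is an assumption

# AlexNet-like convolutional stack with all normalization/regularization removed.
# Layer widths are illustrative, not taken from the paper.
model = keras.Sequential([
    layers.Conv2D(64, 5, activation="relu", padding="same", input_shape=(32, 32, 3)),
    layers.MaxPooling2D(2),
    layers.Conv2D(128, 5, activation="relu", padding="same"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(384, activation="relu"),
    layers.Dense(192, activation="relu"),
    layers.Dense(10),  # logits; margins are measured on these outputs
])

# Vanilla SGD with step size 0.01, no momentum, no learning-rate schedule.
model.compile(
    optimizer=keras.optimizers.SGD(learning_rate=0.01),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Epoch count is an assumption; the paper does not report it here.
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
```

Swapping in `keras.datasets.cifar100` or `keras.datasets.mnist` covers the other datasets named in the table, with the input shape and output dimension adjusted accordingly.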