Stronger Generalization Bounds for Deep Nets via a Compression Approach

Authors: Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | All proposed quantities are empirically studied, including their correlation with generalization (Section 6). Experiments were performed by training a VGG-19 architecture (Simonyan and Zisserman, 2014) and an AlexNet (Krizhevsky et al., 2012) on a multi-class classification task on the CIFAR-10 dataset.
Researcher Affiliation | Academia | 1. Princeton University, Computer Science Department; 2. Duke University, Computer Science Department; 3. Institute for Advanced Study, School of Mathematics.
Pseudocode | Yes | Algorithm 1: Matrix-Project(A, ε, η)
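
For context, the paper's Algorithm 1 compresses a weight matrix by projecting it onto random sign matrices drawn from a shared seed, so that only the projection coefficients (plus the seed) need to be stored. Below is a minimal NumPy sketch, assuming k is chosen on the order of log(1/η)/ε² as the paper's rate states; the exact constant and the `seed` parameter are illustrative assumptions, not the paper's verbatim algorithm.

```python
import numpy as np

def matrix_project(A, eps, eta, seed=0):
    """Sketch of Matrix-Project(A, eps, eta): compress A via random sign matrices.

    Assumption: k follows the O(log(1/eta)/eps^2) rate stated in the paper;
    the constant 16 here is illustrative only.
    """
    k = int(np.ceil(16 * np.log(1.0 / eta) / eps ** 2))
    rng = np.random.default_rng(seed)  # shared seed: the M_i can be regenerated later
    A_hat = np.zeros_like(A, dtype=float)
    for _ in range(k):
        M = rng.choice([-1.0, 1.0], size=A.shape)  # random +/-1 sign matrix
        A_hat += np.sum(A * M) * M                 # Frobenius inner product <A, M> times M
    # E[<A, M> M] = A, so averaging the k terms gives an unbiased approximation of A.
    return A_hat / k
```

Only the k scalar inner products ⟨A, M_i⟩ and the seed need to be stored; the sign matrices are regenerated at decompression time, which is where the compression comes from.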
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is open source or publicly available.
Open Datasets | Yes | Experiments were performed by training a VGG-19 architecture (Simonyan and Zisserman, 2014) and an AlexNet (Krizhevsky et al., 2012) on a multi-class classification task on the publicly available CIFAR-10 dataset.
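
CIFAR-10 is freely downloadable, so this row is straightforward to verify. A minimal sketch using torchvision (an assumption for illustration; the paper does not name the framework it used):

```python
import torchvision
import torchvision.transforms as transforms

# Download the public CIFAR-10 dataset (60,000 32x32 color images, 10 classes).
transform = transforms.ToTensor()
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=transform)
```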
Dataset Splits | No | The paper mentions '92.45% validation accuracy' for VGG-19, indicating the use of a validation set. However, it does not provide specific details on the split percentages or sample counts for this validation set, which are necessary for full reproducibility.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions optimization settings such as 'SGD with mini-batch size 128', but it does not list any specific software or library names with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.0').
Experiment Setup | Yes | Optimization used SGD with mini-batch size 128, weight decay 5e-4, momentum 0.9, and an initial learning rate of 0.05, decayed by a factor of 2 every 30 epochs. Dropout was used in the fully-connected layers.
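
The reported hyperparameters map directly onto a standard optimizer configuration. A minimal PyTorch sketch, assuming a placeholder `model` (the paper trains VGG-19/AlexNet and does not state which framework was used):

```python
import torch

# Hypothetical placeholder; the paper's actual models are VGG-19 and AlexNet on CIFAR-10.
model = torch.nn.Linear(3 * 32 * 32, 10)

# SGD with the hyperparameters reported in the paper.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.05,           # initial learning rate
    momentum=0.9,
    weight_decay=5e-4,
)

# Decay the learning rate by a factor of 2 every 30 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.5)

# The mini-batch size of 128 would be set on the data loader, e.g.:
# loader = torch.utils.data.DataLoader(cifar10_train, batch_size=128, shuffle=True)
```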