Redundant representations help generalization in wide neural networks

Authors: Diego Doimo, Aldo Glielmo, Sebastian Goldt, Alessandro Laio

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We report experimental results obtained with several architectures (fully connected networks, Wide ResNet-28, DenseNet40, ResNet50) and data sets (CIFAR10/100 [25], ImageNet [27]).
Researcher Affiliation | Collaboration | Diego Doimo (International School for Advanced Studies); Aldo Glielmo (International School for Advanced Studies; Bank of Italy); Sebastian Goldt (International School for Advanced Studies); Alessandro Laio (International School for Advanced Studies)
Pseudocode | No | The paper describes analytical methods in text and equations but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Reproducibility. We provide code to reproduce our experiments and our analysis online at https://github.com/diegodoimo/redundant_representation.
Open Datasets | Yes | We report experimental results obtained with several architectures (fully connected networks, Wide ResNet-28, DenseNet40, ResNet50) and data sets (CIFAR10/100 [25], ImageNet [27]). (A data-loading sketch follows the table.)
Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Methods section.
Hardware Specification | Yes | All our experiments are run on Volta V100 GPUs.
Software Dependencies | No | The paper describes methods such as SGD with momentum and ridge regression but does not provide version numbers for any software libraries or frameworks used (e.g., PyTorch, TensorFlow, scikit-learn).
Experiment Setup | Yes | We train all the networks using SGD with momentum and, importantly, weight decay. The amount of weight decay is found with a small grid search, while the other relevant hyperparameters are set following standard practice. We give detailed information on our training setups in Sec. A of the Appendix. (The grid search over weight decay is sketched below the table.)
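The Open Datasets row points to CIFAR10/100 and ImageNet, all publicly available. As a minimal illustration of obtaining the CIFAR variants, assuming torchvision (the quoted text does not name a specific data-loading library):

    from torchvision import datasets, transforms

    transform = transforms.ToTensor()

    # CIFAR10; CIFAR100 works identically via datasets.CIFAR100.
    # ImageNet is not auto-downloadable and must be obtained separately.
    train_set = datasets.CIFAR10(root="./data", train=True,
                                 download=True, transform=transform)
    test_set = datasets.CIFAR10(root="./data", train=False,
                                download=True, transform=transform)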
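The Experiment Setup row quotes the paper's use of SGD with momentum and weight decay, with the weight-decay strength chosen by a small grid search. The exact hyperparameters live in Sec. A of the paper's Appendix, so the sketch below is only a minimal illustration, assuming PyTorch; the toy model, learning rate, momentum, and grid values are placeholders, not the authors' settings.

    import torch

    def make_model():
        # Toy stand-in for the paper's architectures (fully connected
        # nets, Wide ResNet-28, DenseNet40, ResNet50).
        return torch.nn.Sequential(torch.nn.Flatten(),
                                   torch.nn.Linear(3 * 32 * 32, 10))

    def validation_accuracy(model):
        return 0.0  # stub: evaluate on a held-out split in a real run

    # Hypothetical weight-decay grid; the paper's actual values are in
    # Sec. A of its Appendix.
    weight_decay_grid = [1e-5, 1e-4, 1e-3]

    best_wd, best_acc = None, -1.0
    for wd in weight_decay_grid:
        model = make_model()  # fresh initialization at each grid point
        optimizer = torch.optim.SGD(
            model.parameters(),
            lr=0.1,           # placeholder learning rate
            momentum=0.9,     # assumed "standard practice" momentum
            weight_decay=wd,  # the hyperparameter tuned by grid search
        )
        # ... SGD training loop (zero_grad / backward / step) goes here ...
        acc = validation_accuracy(model)
        if acc > best_acc:
            best_wd, best_acc = wd, acc

Each grid point reinitializes the model so the candidate weight-decay values are compared from the same starting conditions; the value with the best held-out accuracy would be kept.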