Sharp Minima Can Generalize For Deep Nets
Authors: Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper argues that most notions of flatness are problematic for deep models and cannot be directly applied to explain generalization. Specifically, for deep networks with rectifier units, the authors exploit the particular geometry of parameter space induced by the inherent symmetries of these architectures to build equivalent models corresponding to arbitrarily sharper minima. Furthermore, if a function is allowed to be reparametrized, the geometry of its parameters can change drastically without affecting its generalization properties. |
| Researcher Affiliation | Collaboration | 1 Université de Montréal, Montréal, Canada; 2 DeepMind, London, United Kingdom; 3 Google Brain, Mountain View, United States; 4 CIFAR Senior Fellow. |
| Pseudocode | No | No pseudocode or algorithm blocks are provided in the paper. The paper focuses on theoretical arguments and mathematical derivations. |
| Open Source Code | No | No statement regarding the release of open-source code for the methodology is found in the paper. |
| Open Datasets | No | The paper is theoretical and does not describe experiments using specific datasets, nor does it provide access information for any dataset it might implicitly refer to in background discussion. |
| Dataset Splits | No | The paper focuses on theoretical arguments and does not involve empirical training or validation splits of data. |
| Hardware Specification | No | The paper is theoretical and does not describe experimental setup or hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameter values or training configurations. |
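The symmetry argument summarized in the Research Type row rests on the positive homogeneity of rectifier units: scaling one layer's weights up by a factor α > 0 and the next layer's weights down by 1/α leaves the network function unchanged, while the curvature around the minimum can be made arbitrarily sharp. A minimal sketch of this equivalence, using a hypothetical two-layer ReLU network (the layer sizes and the name `forward` are illustrative, not from the paper):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, W1, W2):
    # Two-layer ReLU network: relu(x @ W1) @ W2
    return relu(x @ W1) @ W2

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))    # a batch of 5 inputs of dimension 3
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 2))

alpha = 100.0  # any positive scaling factor
y = forward(x, W1, W2)
y_scaled = forward(x, alpha * W1, W2 / alpha)

# ReLU is positively homogeneous: relu(a * z) = a * relu(z) for a > 0,
# so (alpha * W1, W2 / alpha) computes exactly the same function as
# (W1, W2), even though the parameters differ dramatically.
assert np.allclose(y, y_scaled)
```

Because the function is unchanged for every α > 0, any flatness measure that depends on the parameterization (e.g. Hessian eigenvalues along the rescaled directions) can be driven to arbitrarily sharp values without altering the model's predictions or generalization, which is the paper's core objection to such measures.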