Dropout: Explicit Forms and Capacity Control

Authors: Raman Arora, Peter Bartlett, Poorya Mianjy, Nathan Srebro

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide extensive numerical evaluations for validating our theory, including verifying that the proposed theoretical bound on the Rademacher complexity is predictive of the observed generalization gap, as well as highlighting how dropout breaks co-adaptation, a notion that was the main motivation behind the invention of dropout (Hinton et al., 2012).
Researcher Affiliation | Academia | (1) Johns Hopkins University; (2) University of California, Berkeley; (3) TTI Chicago.
Pseudocode | No | The paper describes the methods and processes mathematically and textually, but it does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any statements about releasing source code for the methodology or provide a link to a code repository.
Open Datasets | Yes | We evaluate dropout on the MovieLens dataset (Harper & Konstan, 2016), a publicly available collaborative filtering dataset... We train 2-layer neural networks with and without dropout, on the MNIST dataset of handwritten digits and the Fashion-MNIST dataset of Zalando's article images. (See the code sketches below the table.)
Dataset Splits | No | The paper mentions training and test data, but does not specify details for a separate validation split, nor does it provide exact percentages or counts for how the datasets were partitioned for training, validation, and testing.
Hardware Specification | No | The paper does not mention any specific hardware used for running the experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., specific libraries, frameworks, or operating systems).
Experiment Setup | Yes | We train the model for 100 epochs over the training data, where we use a fixed learning rate of lr = 1, and a batch size of 2000... We initialize the factors using the standard He initialization scheme (He et al., 2015)... The learning rate in all experiments is set to lr = 1e-3. We train the models for 30 epochs over the training set. (Hedged sketches of both setups appear below this table.)
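
For the collaborative-filtering setup quoted above, the following is a minimal sketch, assuming PyTorch, of how dropout-regularized matrix factorization could be trained with the reported hyperparameters (He initialization of the factors, fixed learning rate 1, batch size 2000, 100 epochs). The rank, dropout rate, matrix dimensions, and exact placement of dropout are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of dropout-regularized matrix factorization on a
# MovieLens-style rating matrix (not the authors' implementation).
# Reported settings: He initialization of the factors, fixed lr = 1,
# batch size 2000, 100 epochs. Rank, dropout rate, matrix sizes, and
# where dropout is applied are assumptions for illustration.
import torch
import torch.nn as nn

n_users, n_items, rank = 6040, 3706, 50        # assumed MovieLens-like sizes
U = nn.Parameter(torch.empty(n_users, rank))
V = nn.Parameter(torch.empty(n_items, rank))
nn.init.kaiming_normal_(U)                     # He initialization of the factors
nn.init.kaiming_normal_(V)

dropout = nn.Dropout(p=0.5)                    # dropout rate is an assumption
opt = torch.optim.SGD([U, V], lr=1.0)          # fixed learning rate of 1

def train(users, items, ratings, epochs=100, batch_size=2000):
    """Mini-batch SGD over observed (user, item, rating) index triples."""
    for _ in range(epochs):
        perm = torch.randperm(len(ratings))
        for start in range(0, len(ratings), batch_size):
            idx = perm[start:start + batch_size]
            # Dropout masks latent (rank) dimensions of each prediction.
            pred = dropout(U[users[idx]] * V[items[idx]]).sum(dim=1)
            loss = ((pred - ratings[idx]) ** 2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
```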
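Likewise, here is a minimal sketch, again assuming PyTorch, of a 2-layer network with dropout trained on MNIST with the reported learning rate of 1e-3 for 30 epochs; the hidden width, dropout rate, optimizer choice, and batch size are assumptions not stated in the rows above.

```python
# Hedged sketch of a 2-layer network with dropout (not the authors'
# implementation). Reported settings: lr = 1e-3, 30 epochs. Hidden width,
# dropout rate, optimizer, and batch size are assumptions for illustration.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 1024),   # hidden width is an assumption
    nn.ReLU(),
    nn.Dropout(p=0.5),          # dropout on the hidden layer; rate assumed
    nn.Linear(1024, 10),
)
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=128, shuffle=True)

model.train()                   # keeps dropout active during training
for epoch in range(30):
    for x, y in loader:
        loss = loss_fn(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Swapping datasets.MNIST for datasets.FashionMNIST gives the corresponding Fashion-MNIST variant of the same setup.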