Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Surprising properties of dropout in deep networks
Authors: David P. Helmbold, Philip M. Long
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To complement our theoretical results we performed two sets of experiments. The first set tests the scale dependence of dropout and weight decay, while the second set examines dropout's promotion of negative weights even when learning monotone functions. The code is accessible at https://www.dropbox.com/sh/6s2lcfrq17zshmp/AAAQ06uDa4gOAuAnw2MAghEMa?dl=0 |
| Researcher Affiliation | Collaboration | David P. Helmbold (EMAIL), Department of Computer Science, University of California, Santa Cruz, Santa Cruz, CA 95064, USA; Philip M. Long (EMAIL), Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043, USA |
| Pseudocode | No | The paper describes methods, theorems, lemmas, and proofs in prose and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is accessible at https://www.dropbox.com/sh/6s2lcfrq17zshmp/AAAQ06uDa4gOAuAnw2MAghEMa?dl=0 |
| Open Datasets | No | The paper describes generating training examples uniformly at random from [−1, 1]^K, or defines a small custom training set with specific inputs and labels. It does not provide concrete access information (link, DOI, citation) to a publicly available or open dataset. |
| Dataset Splits | No | The paper describes generating or defining small custom training sets (e.g., 'Ten training examples were generated', 'The training set consists of six inputs'). It does not provide specific details on training/test/validation splits, percentages, or references to predefined splits for reproduction. |
| Hardware Specification | No | The paper mentions that simulations were 'implemented using Torch' and 'with Keras on top of TensorFlow', but it does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions software frameworks like Torch, Keras, and TensorFlow, and the optim package for Torch, but it does not specify any version numbers for these software components. |
| Experiment Setup | Yes | We used stochastic gradient descent using the optim package for Torch, with learning rate 0.01/(1 + 0.00001t) and momentum of 0.5, and a maximum of 100,000 iterations. We used the standard architecture with K = 5 inputs, depth d = 2, n = 5 hidden nodes. W_D was trained with dropout probability 1/2 and no weight decay. W_2 was trained with weight decay with λ = 1/2 and no dropout. W_none was trained without any regularization. These experiments were implemented with Keras on top of TensorFlow and using SGD optimization with a learning rate of 0.005. Weight decay learning used a parameter of 0.05, and dropout training used a dropout rate of 0.5. We used the standard architecture with six inputs, 12 hidden nodes, and one output. We ran each of dropout and weight-decay for 10,000 epochs. |
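For readers reproducing the Torch-based setup, the quoted learning-rate schedule and momentum update can be sketched as follows. This is an illustrative sketch only (the function names and the toy quadratic objective are ours, not from the paper), assuming the schedule reads 0.01/(1 + 0.00001t) and that momentum is the standard heavy-ball variant:

```python
# Sketch (not the authors' code): decayed learning rate and momentum SGD,
# demonstrated on a toy 1-D quadratic loss rather than the paper's networks.
def learning_rate(t, base=0.01, decay=1e-5):
    """Decay schedule as quoted in the setup: lr_t = base / (1 + decay * t)."""
    return base / (1.0 + decay * t)

def sgd_momentum(grad_fn, w0, steps=1000, momentum=0.5):
    """Plain SGD with momentum 0.5, stepping with the schedule above."""
    w, v = w0, 0.0
    for t in range(steps):
        v = momentum * v - learning_rate(t) * grad_fn(w)
        w = w + v
    return w

# Toy usage: minimize f(w) = (w - 3)^2 (gradient 2*(w - 3)); w approaches 3.
w_star = sgd_momentum(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

With decay 1e-5 the learning rate halves only after 100,000 steps, matching the quoted iteration cap, so early training runs at essentially the base rate.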