Toward Understanding the Importance of Noise in Training Neural Networks

Authors: Mo Zhou, Tianyi Liu, Yan Li, Dachao Lin, Enlu Zhou, Tuo Zhao

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments are provided to support our theory.
Researcher Affiliation | Academia | Peking University; Georgia Institute of Technology.
Pseudocode | Yes | Algorithm 1: Perturbed Gradient Descent Algorithm with Noise Annealing.
Open Source Code | No | The paper does not provide an explicit statement about the release of source code or a link to a code repository.
Open Datasets | No | The paper describes the generation of synthetic data ('training data is generated from a teacher network', 'independent Gaussian input'), but it does not specify a publicly available dataset that can be accessed via a link, DOI, or standard citation.
Dataset Splits | No | The paper mentions training but does not provide details about training/validation/test splits or a separate validation set.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper does not list specific software dependencies or version numbers (e.g., programming languages, libraries, frameworks).
Experiment Setup | Yes | For the perturbed GD algorithm, step size and noise annealing are performed epoch-wise: each simulation runs 20 epochs of 400 iterations each; the initial learning rate is 0.1 for both w and a and decays geometrically with ratio 0.8; the initial noise levels are (ρ_w, ρ_a) = (36, 1) and both decay geometrically with ratio 0.4. For GD, the learning rate is 0.1 for both w and a. For SGD, the batch size is 4 and step size annealing is performed epoch-wise: the initial learning rate is 0.1 and decays geometrically with ratio 0.4.
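The Experiment Setup row pins down the annealing schedule but not the network or the exact update rule of Algorithm 1, so the minimal sketch below only lays out that epoch-wise schedule as a training loop. The loss (a separable quadratic with hypothetical targets w_star and a_star) and the choice to add isotropic noise directly to each gradient step are assumptions made for illustration, not the paper's construction.

```python
import numpy as np

# Minimal sketch of the quoted epoch-wise annealing schedule.
# The paper's experiments train a student network (parameters w and a) on data
# from a teacher network with Gaussian inputs; here a separable quadratic toy
# loss stands in so the quoted constants stay numerically stable, and the form
# of the perturbation (noise added to each gradient step) is an assumption.

rng = np.random.default_rng(0)
d = 10
w_star = rng.standard_normal(d)   # hypothetical target for w
a_star = 1.0                      # hypothetical target for a

def grads(w, a):
    """Gradients of the toy loss 0.5*||w - w_star||^2 + 0.5*(a - a_star)^2."""
    return w - w_star, a - a_star

# Constants quoted in the Experiment Setup row above.
epochs, iters_per_epoch = 20, 400
lr, lr_decay = 0.1, 0.8
rho_w, rho_a, noise_decay = 36.0, 1.0, 0.4

w = np.zeros(d)
a = 0.0
for epoch in range(epochs):
    for _ in range(iters_per_epoch):
        gw, ga = grads(w, a)
        # Perturbed update: gradient plus isotropic noise at the current level.
        w -= lr * (gw + rho_w * rng.standard_normal(d))
        a -= lr * (ga + rho_a * rng.standard_normal())
    # Epoch-wise annealing: step size and noise levels decay geometrically.
    lr *= lr_decay
    rho_w *= noise_decay
    rho_a *= noise_decay
    loss = 0.5 * np.sum((w - w_star) ** 2) + 0.5 * (a - a_star) ** 2
    print(f"epoch {epoch + 1:2d}  loss {loss:.4f}")
```

With the quoted decay ratios, the printed loss on this toy objective falls steadily across the 20 epochs as both the step size and the injected noise shrink.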