Step Size Matters in Deep Learning

Authors: Kamil Nar, Shankar Sastry

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate that small changes in the step size could lead to significantly different solutions, we generated a piecewise continuous function f : [0, 1] → R and estimated it with a two-layer network by minimizing... with two different step sizes δ ∈ {2×10⁻⁴, 3×10⁻⁴}, where W ∈ R^{1×20}, V ∈ R^{20}, b ∈ R^{20}, N = 1000 and xᵢ = i/N for all i ∈ [N]. The initial values of W, V and the constant vector b were all drawn from independent standard normal distributions; and the vector b was kept the same for both of the step sizes used. As shown in Figure 2, training with δ = 2×10⁻⁴ converged to a fixed solution, which provided an estimate f̂ close to the original function f.
Researcher Affiliation | Academia | Kamil Nar, S. Shankar Sastry, Electrical Engineering and Computer Sciences, University of California, Berkeley
Pseudocode | No | The paper contains mathematical equations and proofs but no structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code for the experiment is available at https://github.com/nar-k/NeurIPS-2018.
Open Datasets | No | The paper generated a synthetic dataset for its experiment ("we generated a piecewise continuous function f : [0, 1] → R and estimated it with a two-layer network by minimizing... N = 1000 and xᵢ = i/N for all i ∈ [N]"), but it does not provide access information for this specific generated data.
Dataset Splits | No | The paper describes using a dataset of N = 1000 points for its experiment but does not specify any training, validation, or test splits.
Hardware Specification | No | The paper does not specify the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper describes the methods and activations used (e.g., "ReLU activations", "gradient descent algorithm") but does not list specific software dependencies with version numbers (e.g., Python 3.x, TensorFlow x.x, PyTorch x.x).
Experiment Setup | Yes | To demonstrate that small changes in the step size could lead to significantly different solutions, we generated a piecewise continuous function f : [0, 1] → R and estimated it with a two-layer network by minimizing... with two different step sizes δ ∈ {2×10⁻⁴, 3×10⁻⁴}, where W ∈ R^{1×20}, V ∈ R^{20}, b ∈ R^{20}, N = 1000 and xᵢ = i/N for all i ∈ [N]. The initial values of W, V and the constant vector b were all drawn from independent standard normal distributions; and the vector b was kept the same for both of the step sizes used. (A reproduction sketch of this setup follows the table.)
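The quoted setup is concrete enough to sketch in code. Below is a minimal NumPy sketch, not the authors' released implementation (that is at https://github.com/nar-k/NeurIPS-2018): the exact piecewise target f, the random seed, and the iteration count are assumptions, since the excerpt does not specify them. The excerpt calls b a "constant vector", so the sketch leaves b untrained; whether that matches the released code is also an assumption.

```python
import numpy as np

# Reproduction sketch of the two-layer ReLU experiment described above.
# ASSUMED (not in the excerpt): the target function f, the seed, and n_steps.
rng = np.random.default_rng(0)

N = 1000
x = np.arange(1, N + 1) / N                 # x_i = i/N for i in [N]
f = np.where(x < 0.5, 0.2, 0.8) + 0.1 * x   # assumed piecewise continuous target

b = rng.standard_normal(20)                 # "constant vector b": shared across runs, untrained

def train(step_size, n_steps=100_000):
    """Full-batch gradient descent on the mean squared error."""
    W = rng.standard_normal(20)             # W in R^{1x20}, stored flat
    V = rng.standard_normal(20)             # V in R^{20}
    for _ in range(n_steps):
        pre = np.outer(x, V) + b            # (N, 20) pre-activations V x_i + b
        h = np.maximum(pre, 0.0)            # ReLU
        y = h @ W                           # network output f_hat(x_i), shape (N,)
        g = 2.0 / N * (y - f)               # d(MSE)/dy_i
        grad_W = g @ h                      # gradient w.r.t. W
        grad_V = ((g[:, None] * (pre > 0) * W) * x[:, None]).sum(axis=0)
        W -= step_size * grad_W
        V -= step_size * grad_V
    return W, V

for delta in (2e-4, 3e-4):                  # the paper's two step sizes
    W, V = train(delta)
    y_hat = np.maximum(np.outer(x, V) + b, 0.0) @ W
    print(f"step size {delta:.0e}: final MSE = {np.mean((y_hat - f) ** 2):.4f}")
```

Under this setup, the smaller step size should settle to a fixed solution while the larger one may oscillate or fail to converge, mirroring the behavior the paper's Figure 2 illustrates; the exact outcome here depends on the assumed target and seed.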