Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Truth or backpropaganda? An empirical investigation of deep learning theory
Authors: Micah Goldblum, Jonas Geiping, Avi Schwarzschild, Michael Moeller, Tom Goldstein
ICLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike. |
| Researcher Affiliation | Academia | Micah Goldblum, Department of Mathematics, University of Maryland; Jonas Geiping, Department of Computer Science and Electrical Engineering, University of Siegen; Avi Schwarzschild, Department of Mathematics, University of Maryland; Michael Moeller, Department of Computer Science and Electrical Engineering, University of Siegen; Tom Goldstein, Department of Computer Science, University of Maryland |
| Pseudocode | No | No pseudocode or algorithm blocks are explicitly labeled or structured like code. |
| Open Source Code | No | The paper does not contain any statements about making source code publicly available. |
| Open Datasets | Yes | We verify this by training a linear classifier on CIFAR-10... We consider image classification on CIFAR-10 and compare a two-layer MLP, a four-layer MLP, a simple 5-layer ConvNet, and a ResNet. ... In our experiments on CIFAR-10 and CIFAR-100, networks are trained using weight decay coefficients from their respective original papers. |
| Dataset Splits | No | The paper mentions 'training set' and 'test data' but does not specify any validation split or explicit percentages/counts for train/val/test splits, nor does it refer to a specific predefined split with citation. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) are provided for the experimental setup. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Our experiments comparing regularizers all run for 300 epochs with an initial learning rate of 0.1 that decreases by a factor of 10 at epochs 100, 175, 225, and 275. We use the SGD optimizer with momentum 0.9. [...] When naturally training ResNet-18 and Skipless ResNet-18 models, we train with a batch size of 128 for 200 epochs with the learning rate initialized to 0.01 and decreasing by a factor of 10 at epochs 100, 150, 175, and 190 (for both CIFAR-10 and CIFAR-100). When adversarially training these two models on CIFAR-10 data, we use the same hyperparameters. [...] Adversarial training is done with an ℓ∞ 7-step PGD attack with a step size of 2/255, and ϵ = 8/255. For all of the training described above we augment the data with random crops and horizontal flips. |
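The stepwise learning-rate schedule and the ℓ∞ PGD protocol quoted above can be sketched as small helpers. This is a hypothetical illustration using the hyperparameters from the row above (initial LR 0.1 with 10× decays at epochs 100, 175, 225, 275; 7-step PGD with step size 2/255 and ε = 8/255), not the authors' code; `lr_at_epoch` and `pgd_linf_1d` are names introduced here for illustration.

```python
def lr_at_epoch(epoch, base_lr=0.1, milestones=(100, 175, 225, 275), gamma=0.1):
    """Stepwise schedule: multiply base_lr by gamma at each milestone epoch.

    Mirrors the regularizer-comparison setup quoted above (300 epochs,
    initial LR 0.1, decayed by 10x at epochs 100, 175, 225, and 275).
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * (gamma ** drops)


def pgd_linf_1d(x0, grad_fn, eps=8 / 255, step=2 / 255, steps=7):
    """Toy 1-D ℓ∞ PGD: take signed-gradient ascent steps on the loss,
    projecting the iterate back into the eps-ball around x0 each step.

    grad_fn(x) returns the loss gradient at x; real attacks operate on
    image tensors and also clip to the valid pixel range.
    """
    x = x0
    for _ in range(steps):
        g = grad_fn(x)
        x = x + step * (1.0 if g > 0 else -1.0)  # signed-gradient step
        x = max(x0 - eps, min(x0 + eps, x))      # project into the ℓ∞ ball
    return x
```

With these defaults, 7 steps of size 2/255 would travel 14/255, so the projection is what keeps the perturbation within the ε = 8/255 budget.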