Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Truth or backpropaganda? An empirical investigation of deep learning theory
Authors: Micah Goldblum, Jonas Geiping, Avi Schwarzschild, Michael Moeller, Tom Goldstein
ICLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike. |
| Researcher Affiliation | Academia | Micah Goldblum, Department of Mathematics, University of Maryland; Jonas Geiping, Department of Computer Science and Electrical Engineering, University of Siegen; Avi Schwarzschild, Department of Mathematics, University of Maryland; Michael Moeller, Department of Computer Science and Electrical Engineering, University of Siegen; Tom Goldstein, Department of Computer Science, University of Maryland |
| Pseudocode | No | No pseudocode or algorithm blocks are explicitly labeled or structured like code. |
| Open Source Code | No | The paper does not contain any statements about making source code publicly available. |
| Open Datasets | Yes | We verify this by training a linear classifier on CIFAR-10... We consider image classification on CIFAR-10 and compare a two-layer MLP, a four-layer MLP, a simple 5-layer ConvNet, and a ResNet. ... In our experiments on CIFAR-10 and CIFAR-100, networks are trained using weight decay coefficients from their respective original papers. |
| Dataset Splits | No | The paper mentions 'training set' and 'test data' but does not specify any validation split or explicit percentages/counts for train/val/test splits, nor does it refer to a specific predefined split with citation. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) are provided for the experimental setup. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Our experiments comparing regularizers all run for 300 epochs with an initial learning rate of 0.1 that decreases by a factor of 10 at epochs 100, 175, 225, and 275. We use the SGD optimizer with momentum 0.9. [...] When naturally training ResNet-18 and Skipless ResNet-18 models, we train with a batch size of 128 for 200 epochs with the learning rate initialized to 0.01 and decreasing by a factor of 10 at epochs 100, 150, 175, and 190 (for both CIFAR-10 and CIFAR-100). When adversarially training these two models on CIFAR-10 data, we use the same hyperparameters. [...] Adversarial training is done with an ℓ∞ 7-step PGD attack with a step size of 2/255, and ϵ = 8/255. For all of the training described above we augment the data with random crops and horizontal flips. |
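The stepwise learning-rate schedule and the ℓ∞ PGD protocol quoted above can be sketched as small helpers. This is a hypothetical illustration using the hyperparameters from the row above (initial LR 0.1 with 10× decays at epochs 100, 175, 225, 275; 7-step PGD with step size 2/255 and ε = 8/255), not the authors' code; `lr_at_epoch` and `pgd_linf_1d` are names introduced here for illustration.

```python
def lr_at_epoch(epoch, base_lr=0.1, milestones=(100, 175, 225, 275), gamma=0.1):
    """Stepwise schedule: multiply base_lr by gamma at each milestone epoch.

    Mirrors the regularizer-comparison setup quoted above (300 epochs,
    initial LR 0.1, decayed by 10x at epochs 100, 175, 225, and 275).
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * (gamma ** drops)


def pgd_linf_1d(x0, grad_fn, eps=8 / 255, step=2 / 255, steps=7):
    """Toy 1-D ℓ∞ PGD: take signed-gradient ascent steps on the loss,
    projecting the iterate back into the eps-ball around x0 each step.

    grad_fn(x) returns the loss gradient at x; real attacks operate on
    image tensors and also clip to the valid pixel range.
    """
    x = x0
    for _ in range(steps):
        g = grad_fn(x)
        x = x + step * (1.0 if g > 0 else -1.0)  # signed-gradient step
        x = max(x0 - eps, min(x0 + eps, x))      # project into the ℓ∞ ball
    return x
```

With these defaults, 7 steps of size 2/255 would travel 14/255, so the projection is what keeps the perturbation within the ε = 8/255 budget.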