On the training dynamics of deep networks with $L_2$ regularization
Authors: Aitor Lewkowycz, Guy Gur-Ari
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | These empirical relations hold when the network is overparameterized. They can be used to predict the optimal regularization parameter of a given model. In addition, based on these observations we propose a dynamical schedule for the regularization parameter that improves performance and speeds up training. We test these proposals in modern image classification settings. We now turn to an empirical study of networks trained with $L_2$ regularization. In this section we present results for a fully-connected network trained on MNIST, a Wide ResNet [Zagoruyko and Komodakis, 2016] trained on CIFAR-10, and CNNs trained on CIFAR-10. (A hypothetical illustration of such a schedule appears after the table.) |
| Researcher Affiliation | Industry | Aitor Lewkowycz, Google, Mountain View, CA (alewkowycz@google.com); Guy Gur-Ari, Google, Mountain View, CA (guyga@google.com) |
| Pseudocode | No | The paper describes methods and theoretical derivations but does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | In this section we present results for a fully-connected network trained on MNIST, a Wide ResNet [Zagoruyko and Komodakis, 2016] trained on CIFAR-10, and CNNs trained on CIFAR-10. |
| Dataset Splits | No | The paper mentions evaluating on datasets like MNIST and CIFAR-10 and refers to 'Test accuracy' but does not explicitly provide specific train/validation/test dataset split percentages, sample counts, or detailed splitting methodology in the main text. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, or TPU versions) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., PyTorch 1.9 or Python 3.8) for its experimental setup. |
| Experiment Setup | Yes | Figure 1: Wide ResNet 28-10 trained on CIFAR-10 with momentum and data augmentation. A Wide ResNet trained on CIFAR-10 with momentum = 0.9, learning rate = 0.2, and data augmentation. (A minimal training-setup sketch based on these values appears below the table.) |
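The Experiment Setup row quotes concrete hyperparameters: SGD with momentum = 0.9, learning rate = 0.2, data augmentation, and a Wide ResNet 28-10 on CIFAR-10. Below is a minimal sketch of such a setup, not the authors' code: it assumes PyTorch, substitutes torchvision's `resnet18` as a stand-in for the Wide ResNet 28-10 (which torchvision does not ship), and uses a placeholder value for the $L_2$ coefficient, the quantity whose optimal value the paper's empirical relations predict.

```python
import torch
import torchvision

# Stand-in architecture: resnet18 is used purely as a placeholder for the
# Wide ResNet 28-10 quoted in the Experiment Setup row.
model = torchvision.models.resnet18(num_classes=10)

# lr and momentum are taken from the Experiment Setup row; weight_decay is
# the L2 coefficient lambda, and 1e-4 is a hypothetical placeholder value.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.2,
    momentum=0.9,
    weight_decay=1e-4,
)
loss_fn = torch.nn.CrossEntropyLoss()

def train_step(x, y):
    """One SGD step. weight_decay adds lambda * w to each parameter's
    gradient, equivalent to adding (lambda / 2) * ||w||^2 to the loss."""
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```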
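The Research Type row also quotes the paper's proposal of "a dynamical schedule for the regularization parameter." This card does not specify how that schedule works, so the sketch below is only one plausible instantiation, not the paper's method: $\lambda$ starts large and is decayed multiplicatively when validation accuracy plateaus. The class name, thresholds, and the decay rule itself are all assumptions.

```python
class L2Schedule:
    """Hypothetical dynamical schedule for the L2 coefficient lambda:
    decay lambda when validation accuracy stops improving."""

    def __init__(self, lam=1e-3, decay=0.1, patience=5):
        self.lam = lam            # current L2 coefficient (assumed start value)
        self.decay = decay        # multiplicative decay factor (assumed)
        self.patience = patience  # epochs without improvement before decaying
        self.best_acc = 0.0
        self.stale = 0

    def step(self, val_acc):
        if val_acc > self.best_acc:
            self.best_acc, self.stale = val_acc, 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.lam *= self.decay
                self.stale = 0
        return self.lam
```

With the optimizer from the previous sketch, the schedule would be applied once per epoch, e.g. `optimizer.param_groups[0]["weight_decay"] = schedule.step(val_acc)`.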