Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the training dynamics of deep networks with $L_2$ regularization
Authors: Aitor Lewkowycz, Guy Gur-Ari
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | These empirical relations hold when the network is overparameterized. They can be used to predict the optimal regularization parameter of a given model. In addition, based on these observations we propose a dynamical schedule for the regularization parameter that improves performance and speeds up training. We test these proposals in modern image classification settings. We now turn to an empirical study of networks trained with L2 regularization. In this section we present results for a fully-connected network trained on MNIST, a Wide ResNet [Zagoruyko and Komodakis, 2016] trained on CIFAR-10, and CNNs trained on CIFAR-10. |
| Researcher Affiliation | Industry | Aitor Lewkowycz Google Mountain View, CA EMAIL Guy Gur-Ari Google Mountain View, CA EMAIL |
| Pseudocode | No | The paper describes methods and theoretical derivations but does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | In this section we present results for a fully-connected network trained on MNIST, a Wide ResNet [Zagoruyko and Komodakis, 2016] trained on CIFAR-10, and CNNs trained on CIFAR-10. |
| Dataset Splits | No | The paper mentions evaluating on datasets like MNIST and CIFAR-10 and refers to 'Test accuracy' but does not explicitly provide specific train/validation/test dataset split percentages, sample counts, or detailed splitting methodology in the main text. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, or TPU versions) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., PyTorch 1.9 or Python 3.8) tied to the experimental setup. |
| Experiment Setup | Yes | Figure 1: Wide ResNet 28-10 trained on CIFAR-10 with momentum and data augmentation. a Wide ResNet trained on CIFAR-10 with momentum = 0.9, learning rate = 0.2, and data augmentation. |
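For readers unfamiliar with the training setup being classified above, the following is a minimal sketch of SGD with momentum and an $L_2$ penalty, the optimizer family the paper studies. The momentum (0.9) and learning rate (0.2) echo the Experiment Setup row; the toy least-squares problem, the `l2` coefficient, and all function names are illustrative assumptions, not the paper's code.

```python
import numpy as np

def sgd_l2_step(w, grad, velocity, lr=0.2, momentum=0.9, l2=5e-4):
    """One SGD-with-momentum step; the L2 term adds l2 * w to the gradient."""
    g = grad + l2 * w                      # L2 regularization gradient: lambda * w
    velocity = momentum * velocity - lr * g
    return w + velocity, velocity

# Toy problem (an assumption for illustration): minimize
# ||X w - y||^2 / (2 n) + (l2 / 2) ||w||^2 on synthetic linear data.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true

w = np.zeros(4)
v = np.zeros(4)
for _ in range(200):
    grad = X.T @ (X @ w - y) / len(y)     # mean-squared-error gradient
    w, v = sgd_l2_step(w, grad, v)

print(w)  # approaches w_true, shrunk slightly toward zero by the L2 term
```

In the overparameterized networks the paper studies, the interplay between this `l2` coefficient and the learning rate is what drives the empirical relations the authors report; here it only mildly shrinks the recovered weights.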