Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Disentangling the Mechanisms Behind Implicit Regularization in SGD
Authors: Zachary Novack, Simran Kaur, Tanya Marwah, Saurabh Garg, Zachary Chase Lipton
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we conduct an extensive empirical evaluation, focusing on the ability of various theorized mechanisms to close the small-to-large batch generalization gap. |
| Researcher Affiliation | Academia | Zachary Novack UC San Diego EMAIL Simran Kaur Princeton University EMAIL Tanya Marwah Carnegie Mellon University EMAIL Saurabh Garg Carnegie Mellon University EMAIL Zachary Lipton Carnegie Mellon University EMAIL |
| Pseudocode | No | No pseudocode or algorithm block is explicitly presented in the paper. |
| Open Source Code | Yes | The source code for reproducing the work presented here, including all hyperparameters and random seeds, is available at https://github.com/acmi-lab/imp-regularizers. Additional experimental details are available in Appendix A.5. |
| Open Datasets | Yes | on the CIFAR10, CIFAR100, Tiny-ImageNet, and SVHN image classification benchmarks (Krizhevsky, 2009; Le and Yang, 2015; Netzer et al., 2011). |
| Dataset Splits | No | Figure 1: Validation Accuracy and Average Micro-batch (|M| = 128) Gradient Norm for CIFAR10/100 Regularization Experiments, averaged across runs (plots also smoothed for clarity). |
| Hardware Specification | Yes | All experiments were run on a single RTX A6000 NVidia GPU. |
| Software Dependencies | No | All experiments run for the present paper were performed using the PyTorch deep learning API, and source code can be found here: https://github.com/anon2023ICLR/imp-regularizers. |
| Experiment Setup | Yes | Additional experimental details can be found in Appendix A.5. Values for our hyperparameters in our main experiments are detailed below: Table 8: Learning rate (η) used in main experiments... Table 9: Regularization strength (λ) used in main experiments... All experiments were run for 50000 update iterations. No weight decay or momentum was used in any of the experiments. |