Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
Authors: Yang Zhao, Hao Zhang, Xiuyuan Hu
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we confirm that when using our methods, generalization performance of various models could be improved on different datasets. |
| Researcher Affiliation | Academia | 1Department of Electronic Engineering, Tsinghua University. |
| Pseudocode | Yes | Algorithm 1 Optimization Scheme of Penalizing Gradient Norm (a hedged JAX sketch of this scheme follows the table) |
| Open Source Code | Yes | Code is available at https://github.com/zhaoyang-0204/gnp. |
| Open Datasets | Yes | In our experiments, we apply extensive model architectures on Cifar-{10, 100} datasets and ImageNet datasets, respectively. |
| Dataset Splits | No | The paper mentions grid searches for hyperparameters but does not explicitly describe a dedicated validation split needed for reproducibility; it reports training with random seeds and performance on the test sets. |
| Hardware Specification | Yes | all the experiments are deployed using the JAX framework on the NVIDIA DGX Station A100. |
| Software Dependencies | No | The paper mentions "JAX framework" but does not specify a version number for JAX or any other software dependencies with their respective versions. |
| Experiment Setup | Yes | We would adopt a greedy strategy to reduce tuning cost during implementation, ... We would next perform a grid search on the scaler r over the set {0.01, 0.02, 0.05, 0.1, 0.2}. ... After determining the best value of r, we would moreover perform a grid search on the balance coefficient α in the range 0.1 to 0.9 at an interval of 0.1. (See the greedy-search sketch below.) |
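The pseudocode row above refers to the paper's Algorithm 1, which penalizes the gradient norm by optimizing a surrogate built from two gradient evaluations: the plain gradient and the gradient at a point perturbed along the normalized gradient direction. Below is a minimal JAX sketch of that scheme, assuming the final update takes the form (1 − α)·∇L(θ) + α·∇L(θ + r·∇L/‖∇L‖), consistent with the r and α hyperparameters searched in the setup row. The names `gnp_gradient`, `loss_fn`, `params`, and `batch` are placeholders, not the repository's API.

```python
# A minimal sketch of the gradient-norm-penalty update, assuming a
# scalar-valued loss_fn(params, batch). Not the authors' exact code.
import jax
import jax.numpy as jnp

def gnp_gradient(loss_fn, params, batch, r=0.05, alpha=0.5):
    """Return the surrogate gradient (1 - alpha) * g + alpha * g_adv."""
    # Plain gradient g = grad L(theta)
    g = jax.grad(loss_fn)(params, batch)

    # Global L2 norm of g across the whole parameter pytree
    leaves = jax.tree_util.tree_leaves(g)
    g_norm = jnp.sqrt(sum(jnp.vdot(x, x).real for x in leaves))

    # Perturbed point theta_adv = theta + r * g / ||g||
    scale = r / (g_norm + 1e-12)
    params_adv = jax.tree_util.tree_map(lambda p, gi: p + scale * gi,
                                        params, g)

    # Gradient at the perturbed point
    g_adv = jax.grad(loss_fn)(params_adv, batch)

    # Convex combination approximating the gradient of the
    # penalized objective L(theta) + lambda * ||grad L(theta)||
    return jax.tree_util.tree_map(
        lambda a, b: (1.0 - alpha) * a + alpha * b, g, g_adv)
```

The resulting pytree can be passed to any standard optimizer (e.g., SGD with momentum) in place of the plain gradient; the extra cost is one additional forward-backward pass per step.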
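The experiment-setup row describes a greedy, two-stage hyperparameter search: tune r first, then tune α with the best r fixed. A hypothetical sketch of that loop is below; `evaluate` (a callable returning validation accuracy for a given r and α) and the α value held fixed during the r search are assumptions, since the quoted text does not specify them.

```python
# Hypothetical greedy grid search matching the quoted tuning protocol.
# `evaluate(r=..., alpha=...)` is an assumed callable that trains a
# model with those hyperparameters and returns validation accuracy.
def greedy_search(evaluate):
    # Stage 1: grid search on the scaler r (alpha fixed; value assumed)
    best_r = max((0.01, 0.02, 0.05, 0.1, 0.2),
                 key=lambda r: evaluate(r=r, alpha=0.5))

    # Stage 2: grid search on alpha in 0.1 .. 0.9 at intervals of 0.1,
    # with the best r from stage 1 held fixed
    alphas = [round(0.1 * i, 1) for i in range(1, 10)]
    best_alpha = max(alphas, key=lambda a: evaluate(r=best_r, alpha=a))
    return best_r, best_alpha
```

Searching the two hyperparameters sequentially rather than jointly reduces the number of training runs from 5 × 9 = 45 to 5 + 9 = 14, which is the tuning-cost saving the greedy strategy targets.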