Local Regularizer Improves Generalization
Authors: Yikai Zhang, Hui Qu, Dimitris Metaxas, Chao Chen; pp. 6861-6868
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical results are supported by experiments. We observe consistently better generalization performance of LRSGD-R and LRSGD-C over SGD on different neural net architectures. Section 5 (Experiments): We empirically show the generalization power of LRSGD-R and LRSGD-C. We show that they generalize better than SGD for different network architectures. |
| Researcher Affiliation | Academia | Yikai Zhang,1 Hui Qu,1* Dimitris Metaxas,1 Chao Chen2 1Department of Computer Science, Rutgers University 2Departments of Biomedical Informatics, Stony Brook University {yz422, hui.qu, dnm}@cs.rutgers.edu, chao.chen.cchen@gmail.com |
| Pseudocode | Yes | Algorithm 1 SGD and Algorithm 2 LRSGD |
| Open Source Code | Yes | The code is available at https://github.com/huiqu18/LRSGD. |
| Open Datasets | Yes | All experiments are based on the CIFAR10 dataset, which consists of 10 classes of 32×32 color images, with 6k images per class (Krizhevsky and Hinton 2009). |
| Dataset Splits | No | They are split into train and test sets with 50k and 10k images, respectively. The paper does not explicitly state a validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or cloud instance types used for the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., Python, PyTorch, TensorFlow versions or other library versions). |
| Experiment Setup | Yes | The momentum and weight decay parameters of SGD are set to 0.9 and 0.0001. The number of iterations is 13.7e4 (350 epochs), the batch size is 128, and the learning rate is α = 0.1 initially and is decayed by a factor of 10 at iterations 5.8e4 and 9.8e4 (epochs 150 and 250). For LRSGD-R, we set γ = 0.1, λt = 0.01/α. For LRSGD-C, Mt = 10 and λt = 0.01/α. |
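
The schedule quoted in the Experiment Setup row can be checked with a little arithmetic. The sketch below is not taken from the authors' repository. It rests on two labeled assumptions: the last partial batch of each epoch is kept (so an epoch is `ceil(50000 / 128)` iterations), and α in λt = 0.01/α is read as the current learning rate. All function and constant names are hypothetical.

```python
import math

# Values quoted in the table above.
TRAIN_IMAGES = 50_000
BATCH_SIZE = 128
EPOCHS = 350
BASE_LR = 0.1
DECAY_EPOCHS = (150, 250)
DECAY_FACTOR = 10.0

# Assumption: the final partial batch of each epoch is kept, not dropped.
ITERS_PER_EPOCH = math.ceil(TRAIN_IMAGES / BATCH_SIZE)


def learning_rate(epoch: int) -> float:
    """Step schedule: divide the base rate by 10 for each decay epoch reached.

    `epoch` is 0-indexed, so the first decay applies from epoch 150 onward.
    """
    n_decays = sum(epoch >= e for e in DECAY_EPOCHS)
    return BASE_LR / (DECAY_FACTOR ** n_decays)


def lrsgd_lambda(epoch: int) -> float:
    """lambda_t = 0.01 / alpha, reading alpha as the current learning rate.

    Assumption: the source does not say whether alpha is the initial or the
    current learning rate; this sketch uses the current one.
    """
    return 0.01 / learning_rate(epoch)


total_iterations = EPOCHS * ITERS_PER_EPOCH

if __name__ == "__main__":
    print("iterations per epoch:", ITERS_PER_EPOCH)
    print("total iterations:", total_iterations)
    for epoch in (0, 150, 250):
        print(epoch, learning_rate(epoch), lrsgd_lambda(epoch))
```

Under these assumptions an epoch is 391 iterations and the full run is 136,850 iterations, which matches the quoted 13.7e4 after rounding. The decay epochs fall at roughly iterations 5.9e4 and 9.8e4, close to the quoted values.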