Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
Authors: Avrajit Ghosh, He Lyu, Xitong Zhang, Rongrong Wang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore the implicit regularization in (SGD+M) and (GD+M) through a series of experiments validating our theory. (Section 6, Numerical Experiments) |
| Researcher Affiliation | Academia | Avrajit Ghosh, He Lyu, Xitong Zhang, Rongrong Wang; Department of Computational Mathematics, Science and Engineering (CMSE), Michigan State University |
| Pseudocode | No | The paper provides mathematical formulations for the algorithms but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (specific link, explicit statement of release) for the source code of the methodology described. |
| Open Datasets | Yes | ResNet-18 is used to classify a uniformly sub-sampled MNIST dataset with 1000 training images. [...] trained to classify images from the CIFAR-10 and CIFAR-100 datasets. |
| Dataset Splits | No | The paper mentions using MNIST, CIFAR-10, and CIFAR-100 datasets but does not provide specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit references to standard splits). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., "Python 3.8, PyTorch 1.9"). |
| Experiment Setup | Yes | All external regularization schemes except learning rate decay and batch normalization have been turned off. We fix the batch-size to 640 in all our experiments. [...] combinations of (h, β) chosen such that the effective learning rate h/(1 − β) remains same. Table 1 lists specific "β / h" values. |
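
The experiment setup above compares (h, β) pairs that share one effective learning rate h/(1 − β). The following is a minimal sketch of that idea, not the authors' code: it assumes the standard heavy-ball update x_{k+1} = x_k − h∇f(x_k) + β(x_k − x_{k−1}), and the names `step_size_for` and `heavy_ball`, the target value 0.1, and the β values are all hypothetical. The values from the paper's Table 1 are not reproduced here.

```python
def step_size_for(beta, effective_lr):
    """Return the step size h satisfying h / (1 - beta) == effective_lr."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return effective_lr * (1.0 - beta)


def heavy_ball(grad, x0, h, beta, steps):
    """Iterate x_{k+1} = x_k - h * grad(x_k) + beta * (x_k - x_{k-1})."""
    x_prev, x = x0, x0  # first step has zero momentum
    for _ in range(steps):
        x_next = x - h * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x


# Hypothetical target and momentum values, chosen only for illustration.
effective_lr = 0.1
betas = [0.0, 0.5, 0.9]
pairs = [(step_size_for(b, effective_lr), b) for b in betas]

# Toy objective f(x) = x^2 / 2, whose gradient is x.
finals = [heavy_ball(lambda x: x, 1.0, h, b, 500) for h, b in pairs]
```

Each pair in `pairs` has a different step size and momentum but the same ratio h/(1 − β), which is the kind of controlled comparison the setup describes.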