Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent

Authors: Avrajit Ghosh, He Lyu, Xitong Zhang, Rongrong Wang

ICLR 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We explore the implicit regularization in (SGD+M) and (GD+M) through a series of experiments validating our theory." (Section 6, Numerical Experiments) |
| Researcher Affiliation | Academia | Avrajit Ghosh, He Lyu, Xitong Zhang, Rongrong Wang; Department of Computational Mathematics, Science and Engineering (CMSE), Michigan State University |
| Pseudocode | No | The paper provides mathematical formulations of the algorithms but does not include structured pseudocode or algorithm blocks (a hedged sketch of the standard heavy-ball update is given below the table). |
| Open Source Code | No | The paper does not provide concrete access information (a specific link or an explicit release statement) for the source code of the described methodology. |
| Open Datasets | Yes | "ResNet-18 is used to classify a uniformly sub-sampled MNIST dataset with 1000 training images"; networks are also "trained to classify images from the CIFAR-10 and CIFAR-100 datasets" (see the sub-sampling sketch below). |
| Dataset Splits | No | The paper mentions using the MNIST, CIFAR-10, and CIFAR-100 datasets but does not give specific training/validation/test splits (e.g., percentages, sample counts, or explicit references to standard splits). |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as GPU/CPU models, processor types, or memory. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., "Python 3.8, PyTorch 1.9"). |
| Experiment Setup | Yes | "All external regularization schemes except learning rate decay and batch normalization have been turned off." "We fix the batch-size to 640 in all our experiments." Combinations of (h, β) are chosen such that the effective learning rate h/(1 − β) remains the same; Table 1 of the paper lists the specific β and h values (see the sweep sketch below). |
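
Since the paper states the (SGD+M) update only as equations, here is a minimal sketch of the standard heavy-ball iteration v_{t+1} = β v_t − h g_t, x_{t+1} = x_t + v_{t+1}. The function and argument names are illustrative assumptions, not identifiers from the paper.

```python
import torch

def heavy_ball_step(params, grads, velocities, h, beta):
    """One heavy-ball (SGD+M) step: v <- beta*v - h*g, then x <- x + v.
    `params`, `grads`, and `velocities` are matched lists of tensors."""
    with torch.no_grad():
        for p, g, v in zip(params, grads, velocities):
            v.mul_(beta).sub_(g, alpha=h)  # v_{t+1} = beta * v_t - h * g_t
            p.add_(v)                      # x_{t+1} = x_t + v_{t+1}
```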
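For the "uniformly sub-sampled MNIST dataset with 1000 training images", one plausible PyTorch realization is sketched below; the random seed and the transform are assumptions, since the paper does not specify them. The batch size of 640 is taken from the Experiment Setup row.

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

full_train = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())

# Uniformly sub-sample 1000 training images; the seed is an assumption.
gen = torch.Generator().manual_seed(0)
indices = torch.randperm(len(full_train), generator=gen)[:1000]

train_loader = DataLoader(Subset(full_train, indices.tolist()),
                          batch_size=640,   # batch size stated in the paper
                          shuffle=True)
```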
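The setup row describes sweeping (h, β) pairs while holding the effective learning rate h/(1 − β) fixed. The sketch below shows one way to generate such pairs with torch.optim.SGD; the effective learning rate and the β grid are hypothetical, not the paper's Table 1 values. Note that PyTorch's momentum parameterization (v ← βv + g, x ← x − hv) yields the same iterates as the heavy-ball form above up to a rescaling of the velocity.

```python
import torch
from torchvision.models import resnet18

EFF_LR = 0.1  # hypothetical effective learning rate h / (1 - beta)
for beta in (0.0, 0.5, 0.9):        # illustrative momentum values
    h = EFF_LR * (1.0 - beta)       # keeps h / (1 - beta) constant across runs
    model = resnet18(num_classes=10)
    optimizer = torch.optim.SGD(model.parameters(), lr=h, momentum=beta,
                                weight_decay=0.0)  # external regularization off
```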