On the Noisy Gradient Descent that Generalizes as SGD

Authors: Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section we present our empirical results. The setup details are explained in Supplementary Materials, Section C. The code is available at https://github.com/uuujf/MultiNoise. In Figure 1 we test MSGD-Cov on various datasets and models. The results consistently suggest that the MSGD-Cov can generalize well as the vanilla SGD, though its noise belongs to a different distribution class. More interestingly, we observe that the MSGD-Cov converges faster than the vanilla SGD."
Researcher Affiliation | Collaboration | 1 Johns Hopkins University, Baltimore, MD, USA; 2 Missouri University of Science and Technology, Rolla, MO, USA; 3 Big Data Laboratory, Baidu Research, Beijing, China; 4 Styling.AI Inc., Beijing, China; 5 Peking University, Beijing, China.
Pseudocode | Yes | Algorithm 1 (Multiplicative SGD) and Algorithm 2 (Mini-Batch Multiplicative SGD).
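The paper's Algorithm 1 itself is not reproduced in this review. As a rough, hypothetical illustration of what a multiplicative-noise update step can look like (the function name `msgd_step`, its signature, and the choice of mean-one Gaussian multipliers are assumptions for this sketch, not the paper's exact algorithm):

```python
import numpy as np

def msgd_step(w, per_sample_grads, lr, rng, noise_scale=1.0):
    """One illustrative multiplicative-SGD step (hedged sketch).

    Each per-sample gradient is reweighted by a random multiplier u_i
    with E[u_i] = 1, so the injected noise is multiplicative and
    state-dependent (it scales with the gradients), unlike additive
    isotropic noise. This is only a toy stand-in for Algorithm 1.
    """
    n = per_sample_grads.shape[0]
    u = rng.normal(loc=1.0, scale=noise_scale, size=n)  # mean-1 weights
    g = (u[:, None] * per_sample_grads).mean(axis=0)    # reweighted average
    return w - lr * g

# With noise_scale=0 every u_i equals 1 and the update reduces to
# plain full-batch gradient descent.
```

Setting `noise_scale=0` gives a quick sanity check: the step then coincides with ordinary gradient descent on the averaged gradient.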
Open Source Code | Yes | "The code is available at https://github.com/uuujf/MultiNoise."
Open Datasets | Yes | "In Figure 1 we test MSGD-Cov on various datasets and models." Figure 1 panels: (a) Small Fashion MNIST, (b) Small SVHN, (c) CIFAR-10.
Dataset Splits | No | The paper mentions training sets (e.g., "1,000 samples from Fashion MNIST as the training set") and reports test accuracy, but does not provide specific details on validation sets or the precise splitting methodology (percentages, counts) for training, validation, and test sets in the main text. It defers setup details to supplementary materials.
Hardware Specification | No | The paper does not provide any specific details regarding the hardware used for its experiments (e.g., GPU models, CPU types, or cloud computing specifications).
Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | No | The paper mentions the models and datasets used (e.g., "small convolutional network", "VGG-11", "ResNet-18" on "Fashion MNIST", "SVHN", "CIFAR-10") and general training conditions (e.g., "without Batch Normalization", "without using data augmentation and weight decay"), but it explicitly states that "The setup details are explained in Supplementary Materials, Section C." and does not provide specific hyperparameter values or detailed experimental configurations in the main text.