Robustness to Unbounded Smoothness of Generalized SignSGD

Authors: Michael Crawshaw, Mingrui Liu, Francesco Orabona, Wei Zhang, Zhenxun Zhuang

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | The experimental results are shown in Section 5, comparing our algorithm with some popular competitors in deep learning tasks. We conducted our experiments using PyTorch [41] on Nvidia V100 GPUs. |
| Researcher Affiliation | Collaboration | Michael Crawshaw, George Mason University, mcrawsha@gmu.edu; Mingrui Liu, George Mason University, mingruil@gmu.edu; Francesco Orabona, Boston University, francesco@orabona.com; Wei Zhang, IBM T. J. Watson Research Center, weiz@us.ibm.com; Zhenxun Zhuang, Meta Platforms, Inc., oldboymls@gmail.com |
| Pseudocode | Yes | Algorithm 1 Generalized SignSGD (All operations on vectors are element-wise.) A hedged sketch of this update appears after the table. |
| Open Source Code | Yes | Codes can be found at https://github.com/zhenxun-zhuang/Generalized-SignSGD. |
| Open Datasets | Yes | We employ the 20-layer Residual Network model [18] to do image classification on the CIFAR-10 dataset. We adopt a 3-layer AWD-LSTM [35] to do language modeling on the Penn Treebank (PTB) dataset [33] (word level). |
| Dataset Splits | Yes | We use grid search to fine-tune the initial learning rate for all optimizers, as well as the clipping threshold for SGDClipGrad and SGDClipMomentum, and β2 for Adam and our algorithm, to select the one giving the best validation performance on a separate validation set. A sketch of such a selection loop appears after the table. |
| Hardware Specification | Yes | We conducted our experiments using PyTorch [41] on Nvidia V100 GPUs. |
| Software Dependencies | No | The paper mentions using PyTorch [41] but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The mini-batch size is 128 and we train all algorithms for 164 epochs. We fixed the weight decay value to be 0.0001 and the momentum parameter (β1) to be 0.9. See the training-loop sketch after the table. |
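
The paper's Algorithm 1 is only named in the table above. As an illustration, here is a minimal PyTorch sketch of an element-wise, Adam-style "generalized sign" update in which β1 = β2 = 0 recovers plain SignSGD. The exact form of the denominator, the absence of bias correction, and the `eps` safeguard are assumptions made for this sketch, not details confirmed by the excerpt.

```python
import torch

def generalized_signsgd_step(param, grad, state, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One element-wise update in the spirit of Algorithm 1 (assumed form).

    The step is m / sqrt(v) with exponential moving averages m and v;
    with beta1 = beta2 = 0 it reduces to -lr * sign(grad).
    """
    # Lazily initialize the moving averages for this parameter.
    m = state.setdefault("m", torch.zeros_like(param))
    v = state.setdefault("v", torch.zeros_like(param))

    m.mul_(beta1).add_(grad, alpha=1 - beta1)            # first moment (momentum)
    v.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)  # second-moment estimate

    # Element-wise "generalized sign" step; eps is a numerical safeguard
    # added here, not part of the quoted pseudocode.
    param.addcdiv_(m, v.sqrt().add_(eps), value=-lr)
```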
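
The Dataset Splits row describes tuning by grid search against a held-out validation set. A minimal sketch of that selection loop follows; the grid values and the `train_and_validate` routine are hypothetical placeholders, not taken from the paper.

```python
from itertools import product

def train_and_validate(lr, beta2):
    """Placeholder: train with the given hyperparameters, return validation loss."""
    raise NotImplementedError

# Hypothetical search grid; the excerpt does not list the actual values tried.
learning_rates = [1e-3, 1e-2, 1e-1]
beta2_values = [0.99, 0.999]

best_config, best_val = None, float("inf")
for lr, beta2 in product(learning_rates, beta2_values):
    val_loss = train_and_validate(lr=lr, beta2=beta2)
    if val_loss < best_val:
        best_config, best_val = (lr, beta2), val_loss
```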
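
The Experiment Setup row fixes the batch size, epoch count, weight decay, and momentum. The sketch below shows where each reported constant plugs into a standard PyTorch training loop; the model constructor, the dataset object, and the initial learning rate of 0.1 are placeholders (the paper tunes the learning rate by grid search), and plain momentum SGD stands in for the paper's own optimizer.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

# Reported constants: batch size 128, 164 epochs, weight decay 1e-4, beta1 = 0.9.
loader = DataLoader(cifar10_train, batch_size=128, shuffle=True)  # cifar10_train: placeholder dataset
model = build_resnet20()  # placeholder for the 20-layer ResNet of [18]

# lr=0.1 is a placeholder; the paper selects it by grid search on validation data.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)

for epoch in range(164):
    for x, y in loader:
        optimizer.zero_grad()
        F.cross_entropy(model(x), y).backward()
        optimizer.step()
```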