Adam Can Converge Without Any Modification On Update Rules

Authors: Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhi-Quan Luo

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We run experiments for different choices of (β1, β2) on a few tasks. First, we run Adam on a convex function (2) with fixed n (see the definition in Section 3.2). Second, we run Adam on the classification problem on the MNIST and CIFAR-10 datasets with a fixed batch size. We observe some interesting phenomena in Figure 1 (a), (b), and (c). While Adam's performance seems unstable in the red region, we find that Adam always performs well in the top blue region of Figure 1. (An illustrative (β1, β2) sweep sketch appears after this table.)
Researcher Affiliation | Academia | The Chinese University of Hong Kong, Shenzhen, China; University of Michigan, US; Shenzhen Research Institute of Big Data
Pseudocode | Yes | We present randomly shuffled Adam in Algorithm 1. In Algorithm 1, m denotes the 1st-order momentum and v denotes the 2nd-order momentum; they are exponentially weighted averages controlled by the hyperparameters β1 and β2, respectively. (A minimal sketch of these moment updates appears after this table.)
Open Source Code | No | No explicit statement or link providing concrete access to source code for the described methodology was found. The paper mentions 'open-source' only in the context of general deep learning libraries, not for the authors' own implementation.
Open Datasets | Yes | Second, we run Adam on the classification problem on the MNIST and CIFAR-10 datasets with a fixed batch size. These datasets are commonly referenced and cited: Deng, L. The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6):141–142, 2012. Krizhevsky, A., Hinton, G., et al. Learning multiple layers of features from tiny images. 2009.
Dataset Splits | No | No specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning was found. The paper mentions using a fixed batch size for the MNIST and CIFAR-10 experiments but gives no train/validation/test splits.
Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used to run the experiments were found. The paper describes the experimental setup but does not specify hardware.
Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments were found. The paper does not mention software versions.
Experiment Setup | Yes | All the experimental settings and hyperparameters are presented in Appendix B.1. (The detailed experiment setup is deferred to the appendix.)
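
To make the (β1, β2) sweep quoted in the Research Type row concrete, here is a hedged PyTorch sketch that grid-searches the two momentum parameters on a toy quadratic objective. The objective, grid values, learning rate, and step count are illustrative assumptions, not the paper's settings; those are given in its Appendix B.1.

```python
import itertools
import torch

def loss_after_training(beta1, beta2, steps=2000, lr=1e-3):
    # Hypothetical toy problem: fit a vector to a fixed target with Adam.
    torch.manual_seed(0)
    x = torch.zeros(10, requires_grad=True)
    target = torch.linspace(-1.0, 1.0, 10)
    opt = torch.optim.Adam([x], lr=lr, betas=(beta1, beta2))
    for _ in range(steps):
        opt.zero_grad()
        loss = ((x - target) ** 2).sum()
        loss.backward()
        opt.step()
    # Recompute the loss at the final iterate for reporting.
    return ((x - target) ** 2).sum().item()

# Assumed grid purely for illustration; the paper sweeps its own (β1, β2) range.
for b1, b2 in itertools.product([0.0, 0.5, 0.9], [0.5, 0.9, 0.999]):
    print(f"beta1={b1}, beta2={b2}: final loss {loss_after_training(b1, b2):.4e}")
```

On this convex toy problem every setting converges; the paper's point is that on its test problems the outcome depends on where (β1, β2) falls relative to the regions shown in its Figure 1.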
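The Pseudocode row describes Algorithm 1 only at a high level, so below is a minimal NumPy sketch of the standard Adam moment updates it refers to. This is a generic illustration under the usual Adam conventions; details specific to the paper's randomly shuffled Algorithm 1 (the epoch-wise shuffling, whether bias correction is applied, the eps term) are assumptions not fixed by the quoted excerpt.

```python
import numpy as np

def adam_step(theta, grad, m, v, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # 1st-order momentum m: exponentially weighted average of gradients (beta1).
    m = beta1 * m + (1 - beta1) * grad
    # 2nd-order momentum v: exponentially weighted average of squared gradients (beta2).
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Parameter step scaled elementwise by the square root of the 2nd moment.
    theta = theta - lr * m / (np.sqrt(v) + eps)
    return theta, m, v
```

In the paper's setting the gradient passed in at each step would come from one minibatch of a randomly shuffled epoch, with m and v carried across steps.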