On the Convergence of Adam and Beyond

Authors: Sashank J. Reddi, Satyen Kale, Sanjiv Kumar

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present empirical results on both synthetic and real-world datasets. For our experiments, we study the problem of multiclass classification using logistic regression and neural networks, representing convex and nonconvex settings, respectively.
Researcher Affiliation | Industry | Sashank J. Reddi, Satyen Kale & Sanjiv Kumar; Google New York, New York, NY 10011, USA; {sashank,satyenkale,sanjivk}@google.com
Pseudocode | Yes | Algorithm 1: Generic Adaptive Method Setup; Algorithm 2: AMSGRAD; Algorithm 3: ADAMNC
Open Source Code | No | The paper does not contain any explicit statement about making its source code publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | We use MNIST dataset for this experiment, the classification is based on 784 dimensional image vector to one of the 10 class labels. ... Finally, we consider the multiclass classification problem on the standard CIFAR-10 dataset, which consists of 60,000 labeled examples of 32 × 32 images.
Dataset Splits | No | The paper mentions training and testing but does not explicitly describe validation splits or the splitting methodology.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU model, CPU type) used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers.
Experiment Setup | Yes | We use a minibatch version of these algorithms with minibatch size set to 128. We set β1 = 0.9 and β2 is chosen from the set {0.99, 0.999} ... Furthermore, we use constant αt = α throughout all our experiments on neural networks.
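
Since no source code is released, the following is a minimal sketch of a single AMSGRAD update (Algorithm 2 in the Pseudocode row), using the β1 = 0.9 and β2 = 0.999 values quoted in the Experiment Setup row and a constant step size α. It is an illustration, not the authors' implementation. The function name `amsgrad_step`, the default `alpha=0.001`, and the `eps` stabilizer are assumptions introduced here, and the projection step of the constrained setting is omitted.

```python
import math


def amsgrad_step(x, grad, m, v, v_hat,
                 alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one AMSGRAD update to parameters stored as lists of floats.

    alpha and eps are illustrative assumptions; beta1 and beta2 follow the
    values quoted in the Experiment Setup row. Returns (x, m, v, v_hat).
    """
    new_x, new_m, new_v, new_v_hat = [], [], [], []
    for xi, gi, mi, vi, vhi in zip(x, grad, m, v, v_hat):
        # Exponential moving averages of the gradient and squared gradient.
        mi = beta1 * mi + (1.0 - beta1) * gi
        vi = beta2 * vi + (1.0 - beta2) * gi * gi
        # Key difference from Adam: keep the running maximum of v, so the
        # effective per-coordinate step size never increases.
        vhi = max(vhi, vi)
        # Unconstrained update with constant step size alpha.
        xi = xi - alpha * mi / (math.sqrt(vhi) + eps)
        new_x.append(xi)
        new_m.append(mi)
        new_v.append(vi)
        new_v_hat.append(vhi)
    return new_x, new_m, new_v, new_v_hat


# Usage: two updates on a single parameter, with gradients 2.0 and then 0.0.
x, m, v, v_hat = amsgrad_step([1.0], [2.0], [0.0], [0.0], [0.0])
x, m, v, v_hat = amsgrad_step(x, [0.0], m, v, v_hat)
```

In the second call the moving average `v` decays because the gradient is zero, while `v_hat` keeps its earlier, larger value; this is the behavior that distinguishes the update from Adam.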