On the Convergence of Adam and Beyond
Authors: Sashank J. Reddi, Satyen Kale, Sanjiv Kumar
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present empirical results on both synthetic and real-world datasets. For our experiments, we study the problem of multiclass classification using logistic regression and neural networks, representing convex and nonconvex settings, respectively. |
| Researcher Affiliation | Industry | Sashank J. Reddi, Satyen Kale & Sanjiv Kumar Google New York New York, NY 10011, USA {sashank,satyenkale,sanjivk}@google.com |
| Pseudocode | Yes | Algorithm 1 Generic Adaptive Method Setup; Algorithm 2 AMSGRAD; Algorithm 3 ADAMNC |
| Open Source Code | No | The paper does not contain any explicit statement about making its source code publicly available or provide a link to a code repository. |
| Open Datasets | Yes | We use MNIST dataset for this experiment, the classification is based on 784 dimensional image vector to one of the 10 class labels. ... Finally, we consider the multiclass classification problem on the standard CIFAR-10 dataset, which consists of 60,000 labeled examples of 32 × 32 images. |
| Dataset Splits | No | The paper refers to training and test performance but does not explicitly describe validation splits or how the data was partitioned. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU model, CPU type) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We use a minibatch version of these algorithms with minibatch size set to 128. We set β1 = 0.9 and β2 is chosen from the set {0.99, 0.999} ... Furthermore, we use constant αt = α throughout all our experiments on neural networks. |
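The core algorithmic contribution assessed above is AMSGRAD (Algorithm 2), which modifies Adam by dividing the step by the running *maximum* of the second-moment estimate rather than the current estimate, so the effective learning rate is non-increasing per coordinate. A minimal NumPy sketch of a single update step is shown below; the function name, signature, and default hyperparameters (`beta1=0.9`, `beta2=0.999`, matching the paper's reported grid of β1 = 0.9 and β2 ∈ {0.99, 0.999}) are illustrative, not the authors' implementation.

```python
import numpy as np

def amsgrad_step(w, grad, state, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update (sketch of Algorithm 2).

    state = (m, v, v_hat): first moment, second moment, and the running
    max of the second moment. Hyperparameter defaults are illustrative.
    """
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grad           # exponential avg of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2      # exponential avg of squared grads
    v_hat = np.maximum(v_hat, v)                 # key difference from Adam:
                                                 # never let the denominator shrink
    w = w - alpha * m / (np.sqrt(v_hat) + eps)
    return w, (m, v, v_hat)

# Toy usage: minimize f(w) = w^2, whose gradient is 2w.
w = np.array([5.0])
state = (np.zeros(1), np.zeros(1), np.zeros(1))
for _ in range(2000):
    w, state = amsgrad_step(w, 2 * w, state, alpha=0.01)
```

Because `v_hat` is monotonically non-decreasing, the per-coordinate step size `alpha / sqrt(v_hat)` can only shrink over time, which is the property the paper's convergence analysis relies on and which plain Adam violates on its counterexample.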