Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Convergence of Adam and Beyond
Authors: Sashank J. Reddi, Satyen Kale, Sanjiv Kumar
ICLR 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present empirical results on both synthetic and real-world datasets. For our experiments, we study the problem of multiclass classification using logistic regression and neural networks, representing convex and nonconvex settings, respectively. |
| Researcher Affiliation | Industry | Sashank J. Reddi, Satyen Kale & Sanjiv Kumar, Google New York, New York, NY 10011, USA |
| Pseudocode | Yes | Algorithm 1 Generic Adaptive Method Setup; Algorithm 2 AMSGRAD; Algorithm 3 ADAMNC |
| Open Source Code | No | The paper does not contain any explicit statement about making its source code publicly available or provide a link to a code repository. |
| Open Datasets | Yes | We use MNIST dataset for this experiment, the classification is based on 784 dimensional image vector to one of the 10 class labels. ... Finally, we consider the multiclass classification problem on the standard CIFAR-10 dataset, which consists of 60,000 labeled examples of 32 × 32 images. |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly provide details about specific validation dataset splits or methodology. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU model, CPU type) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We use a minibatch version of these algorithms with minibatch size set to 128. We set β1 = 0.9 and β2 is chosen from the set {0.99, 0.999} ... Furthermore, we use constant αt = α throughout all our experiments on neural networks. |
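The pseudocode row above points to Algorithm 2 (AMSGRAD), and the experiment-setup row quotes β1 = 0.9 and a constant step size α. As a rough illustration only, here is a minimal NumPy sketch of the AMSGrad-style update, i.e. Adam with an elementwise running maximum of the second-moment estimate. The function name `amsgrad_step`, the state layout, the `eps` term, and the default α and β2 values are our own conventions, not taken from the paper.

```python
import numpy as np

def amsgrad_step(theta, grad, state, alpha=0.001, beta1=0.9,
                 beta2=0.999, eps=1e-8):
    """One AMSGrad-style parameter update (illustrative sketch).

    m and v are exponential moving averages of the gradient and squared
    gradient; v_hat keeps the elementwise maximum of all past v, which is
    the key change relative to Adam and guarantees a non-increasing
    effective step size per coordinate.
    """
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    v_hat = np.maximum(v_hat, v)  # running max: the AMSGrad modification
    theta = theta - alpha * m / (np.sqrt(v_hat) + eps)
    return theta, (m, v, v_hat)

# Toy usage: minimize f(x) = x^2 from x = 1 with a constant step size,
# mirroring the constant-alpha choice quoted in the experiment setup.
x, state = 1.0, (0.0, 0.0, 0.0)
for _ in range(500):
    x, state = amsgrad_step(x, 2 * x, state, alpha=0.01)
```

This is a sketch under stated assumptions, not the paper's implementation; in particular, the paper's theoretical analysis uses a decreasing step size αt, while the quoted neural-network experiments fix αt = α as done here.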