Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On the Convergence of Adam and Beyond

Authors: Sashank J. Reddi, Satyen Kale, Sanjiv Kumar

ICLR 2018 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we present empirical results on both synthetic and real-world datasets. For our experiments, we study the problem of multiclass classification using logistic regression and neural networks, representing convex and nonconvex settings, respectively."
Researcher Affiliation | Industry | Sashank J. Reddi, Satyen Kale & Sanjiv Kumar, Google New York, New York, NY 10011, USA
Pseudocode | Yes | Algorithm 1 Generic Adaptive Method Setup; Algorithm 2 AMSGRAD; Algorithm 3 ADAMNC (the AMSGrad update is sketched below)
Open Source Code | No | The paper does not contain any explicit statement about making its source code publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | "We use MNIST dataset for this experiment, the classification is based on 784 dimensional image vector to one of the 10 class labels. ... Finally, we consider the multiclass classification problem on the standard CIFAR-10 dataset, which consists of 60,000 labeled examples of 32 × 32 images."
Dataset Splits | No | The paper mentions training and testing but does not explicitly describe validation splits or a split methodology.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU model, CPU type) used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers.
Experiment Setup | Yes | "We use a minibatch version of these algorithms with minibatch size set to 128. We set β1 = 0.9 and β2 is chosen from the set {0.99, 0.999} ... Furthermore, we use constant α_t = α throughout all our experiments on neural networks." (A configuration sketch follows the table.)
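The AMSGrad method referenced in the Pseudocode row modifies Adam by tracking the running maximum of the second-moment estimate. The following is a minimal NumPy sketch of that update for illustration only; the function name, argument defaults, and the eps term are our own choices (the paper released no code), with beta1 = 0.9 matching the paper's reported setting.

```python
import numpy as np

def amsgrad_step(param, grad, m, v, v_hat, alpha=0.001,
                 beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update in the spirit of the paper's Algorithm 2.

    The difference from Adam is the running maximum v_hat of the
    second-moment estimate, which prevents the effective step size
    from growing. eps and the default alpha/beta2 are common practical
    choices, not values fixed by the paper.
    """
    m = beta1 * m + (1.0 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad**2     # second-moment estimate
    v_hat = np.maximum(v_hat, v)                # AMSGrad: denominator never shrinks
    param = param - alpha * m / (np.sqrt(v_hat) + eps)
    return param, m, v, v_hat
```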
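For the Experiment Setup row, the reported hyperparameters (minibatch size 128, β1 = 0.9, β2 ∈ {0.99, 0.999}, constant step size for the neural-network runs) could be reproduced along the lines of the sketch below. It assumes PyTorch's built-in amsgrad option and a stand-in MNIST model; neither comes from the authors, who did not release an implementation, and the concrete learning rate is a placeholder.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Hyperparameters reported in the paper: minibatch size 128, beta1 = 0.9,
# beta2 in {0.99, 0.999}, and a constant step size alpha for the
# neural-network experiments. The alpha value below is a placeholder,
# not a number taken from the paper.
batch_size = 128
beta1, beta2 = 0.9, 0.999
alpha = 1e-3  # placeholder constant learning rate

train_loader = DataLoader(
    datasets.MNIST("data", train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=batch_size, shuffle=True)

model = torch.nn.Sequential(   # simple stand-in: logistic regression on
    torch.nn.Flatten(),        # 784-dimensional MNIST images, matching the
    torch.nn.Linear(784, 10))  # convex setting described in the paper

optimizer = torch.optim.Adam(model.parameters(), lr=alpha,
                             betas=(beta1, beta2), amsgrad=True)
```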