Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Adam-family Methods with Decoupled Weight Decay in Deep Learning
Authors: Kuangyu Ding, Nachuan Xiao, Kim-Chuan Toh
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments demonstrate that AdamD outperforms Adam and is comparable to AdamW, in the aspects of both generalization performance and efficiency. [...] In this section, we conduct numerical experiments to demonstrate the effectiveness of AdamD in the context of image classification and language modeling tasks. |
| Researcher Affiliation | Academia | Kuangyu Ding EMAIL Edwardson School of Industrial Engineering Purdue University Nachuan Xiao EMAIL School of Data Science The Chinese University of Hong Kong, Shenzhen Kim-Chuan Toh EMAIL Department of Mathematics and Institute of Operations Research and Analytics National University of Singapore |
| Pseudocode | Yes | Algorithm 1 Adam with decoupled weight decay (AdamD) for nonsmooth problem (UOP). [...] Algorithm 2 AdamW (Loshchilov & Hutter, 2019). |
| Open Source Code | No | The paper does not explicitly state that source code for their methodology is released, nor does it provide a direct link to a code repository. It only mentions the implementation environment: "All experiments are conducted using an NVIDIA RTX 3090 Ti GPU and are implemented in Python 3.9 with PyTorch 1.12.0." |
| Open Datasets | Yes | Our image classification experiments include the deployment of well-established architectures, namely Resnet34 (He et al., 2016) and Densenet121 (Huang et al., 2018), to train the CIFAR-10 and CIFAR-100 datasets (Krizhevsky et al., 2009). Our language modeling experiments focus on LSTM networks applied to the Penn Treebank dataset (Marcus et al., 1993). |
| Dataset Splits | Yes | In all our experiments on image classification, we train the models consistently for 200 epochs, employing a batch size of 128. At the 150th epoch, we reduce the step size by a factor of 0.1. [...] In all our language modeling experiments, we train our models for 200 epochs using a batch size of 128. We employ a step size reduction strategy that decreases the learning rate to 0.1 times its previous value twice during training, specifically at the 75th and 150th epochs. |
| Hardware Specification | Yes | All experiments are conducted using an NVIDIA RTX 3090 Ti GPU and are implemented in Python 3.9 with PyTorch 1.12.0. |
| Software Dependencies | Yes | All experiments are conducted using an NVIDIA RTX 3090 Ti GPU and are implemented in Python 3.9 with PyTorch 1.12.0. |
| Experiment Setup | Yes | In all our experiments on image classification, we train the models consistently for 200 epochs, employing a batch size of 128. At the 150th epoch, we reduce the step size by a factor of 0.1. [...] For the weight decay parameter, we consider values in σ ∈ {5×10⁻³, 10⁻³, 5×10⁻⁴, 10⁻⁴}. By fixing σ first, we ensure that all methods solve the same minimization problem. With σ fixed, we then perform a grid search over the learning rate η for AdamD, Adam, and AdamW using η ∈ {5×10⁻⁵, 10⁻⁴, 5×10⁻⁴, 10⁻³, 5×10⁻³, 10⁻², 5×10⁻², 10⁻¹}. Other parameters are set as follows: Adam/AdamW: We set ε = 10⁻⁸, θ_k = 10⁻¹ and β = 10⁻³ as the default setting in PyTorch. AdamD: We set θ_s = θ_0 (log(s+2))^(−3/2), with s representing the epoch number. [...] Here, we set the initial momentum parameter to θ_0 = 10⁻¹, the second moment parameter to β = 10⁻³ and the regularization parameter to ε = 10⁻⁸, which are the same as the default settings in PyTorch for Adam/AdamW. |
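
The hyperparameter protocol quoted in the Experiment Setup row can be sketched in plain Python. This is an illustrative reading of the quoted text, not the authors' code (the paper releases none). The function names `step_size`, `adamd_momentum`, and `hyperparameter_grid` are hypothetical, and treating "the 150th epoch" as epoch index 150 with zero-based counting is an assumption.

```python
import math

# Values quoted in the Experiment Setup row.
WEIGHT_DECAYS = [5e-3, 1e-3, 5e-4, 1e-4]
LEARNING_RATES = [5e-5, 1e-4, 5e-4, 1e-3, 5e-3, 1e-2, 5e-2, 1e-1]


def step_size(base_lr, epoch, milestones=(150,), gamma=0.1):
    """Step-decayed learning rate: multiply by gamma once per milestone reached.

    Assumption: epochs are zero-indexed and the drop applies from the
    milestone epoch onward. Image classification uses milestones=(150,);
    language modeling uses milestones=(75, 150).
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


def adamd_momentum(epoch, theta0=0.1):
    """Momentum schedule theta_s = theta0 * (log(s + 2)) ** (-3/2), s = epoch."""
    return theta0 * math.log(epoch + 2) ** -1.5


def hyperparameter_grid():
    """Fix the weight decay first, then sweep the learning rate."""
    for weight_decay in WEIGHT_DECAYS:
        for lr in LEARNING_RATES:
            yield weight_decay, lr


if __name__ == "__main__":
    for weight_decay, lr in hyperparameter_grid():
        # A real run would train for 200 epochs here; this only prints the schedule ends.
        print(weight_decay, lr, step_size(lr, 199), adamd_momentum(199))
```

Iterating the weight decay in the outer loop mirrors the stated rationale: every optimizer is compared on the same regularized problem before its learning rate is tuned.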