Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities

Authors: Alina Ene, Huy L. Nguyen, Adrian Vladu (pp. 7314-7321)

Venue: AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | From the paper's "Experimental Evaluation" section: "To empirically validate the ADAACSA and ADAAGD+ algorithms, we test them on a series of standard models encountered in machine learning. While the analyses we provided are specifically crafted for convex objectives, we see that these methods exhibit good behavior in the non-convex settings corresponding to training deep learning models. This may be motivated by the fact that a significant part of the optimization performed when training such models occurs within convex regions (Leclerc and Madry 2020)."
Researcher Affiliation | Academia | 1) Department of Computer Science, Boston University; 2) Khoury College of Computer and Information Science, Northeastern University; 3) CNRS & IRIF, Université de Paris. Contact: aene@bu.edu, hu.nguyen@northeastern.edu, vladu@irif.fr
Pseudocode | Yes | Figure 1: ADAGRAD+ algorithm. Figure 2: ADAACSA algorithm. Figure 3: ADAAGD+ algorithm. Figure 4: Adaptive Mirror-Prox algorithm, extending (Bach and Levy 2019) to the vector setting. (A generic projected adaptive-gradient sketch, distinct from these figures, appears after this table.)
Open Source Code | No | The paper does not provide a link to, or an explicit statement about the availability of, source code for the described methodology. It mentions: "We give the complete experimental details in the full version (Ene, Nguyen, and Vladu 2021).", which points to an arXiv preprint, not a code repository.
Open Datasets | Yes | From the classification experiments: "Additionally, we tested these optimization methods on three different classification models typically encountered in machine learning. The first one is logistic regression on the MNIST dataset. ... The second is a convolutional neural network on the MNIST dataset. ... The third is a residual network on the CIFAR-10 dataset." (An illustrative training-loop sketch for the first setup appears after this table.)
Dataset Splits | No | The paper uses MNIST and CIFAR-10 but does not state how the datasets were split into training, validation, and test sets (no percentages, sample counts, or cross-validation scheme). Standard splits exist for these datasets, but the paper does not explicitly say they were used.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, or cloud computing specifications).
Software Dependencies | No | The paper does not provide version numbers for any software dependencies or libraries used (e.g., Python, PyTorch, or TensorFlow).
Experiment Setup | No | The paper states: "We performed extensive hyper-parameter tuning, such that each method we compare against has the opportunity to exhibit its best possible performance." However, the main text does not list the specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or other training configurations used.
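To make the Pseudocode row more concrete, the following is a minimal sketch of a projected adaptive-gradient loop. It is an assumption-based illustration of the general family of methods assessed here (a diagonal AdaGrad step followed by a Euclidean projection onto a box constraint); it is not a reproduction of the paper's ADAGRAD+, ADAACSA, or ADAAGD+ pseudocode, which project in the adaptive metric and, for the accelerated variants, maintain additional interpolation sequences.

```python
# Illustrative sketch only: a generic diagonal-AdaGrad step with a Euclidean
# projection onto a box constraint. This is NOT the paper's ADAGRAD+/ADAACSA/
# ADAAGD+ pseudocode (Figures 1-3).
import numpy as np

def projected_adagrad(grad_fn, project_fn, x0, lr=0.1, eps=1e-8, steps=100):
    """Run a simple projected adaptive-gradient loop.

    grad_fn(x)    -> gradient of the objective at x
    project_fn(x) -> Euclidean projection of x onto the feasible set
    """
    x = np.asarray(x0, dtype=float)
    g_sq = np.zeros_like(x)            # running sum of squared gradients
    for _ in range(steps):
        g = grad_fn(x)
        g_sq += g ** 2                 # per-coordinate accumulator
        x = x - lr * g / (np.sqrt(g_sq) + eps)
        x = project_fn(x)              # keep the iterate feasible
    return x

# Toy usage: minimize ||x - c||^2 over the box [0, 1]^3.
c = np.array([1.5, -0.3, 0.7])
x_star = projected_adagrad(
    grad_fn=lambda x: 2 * (x - c),
    project_fn=lambda x: np.clip(x, 0.0, 1.0),
    x0=np.zeros(3),
    steps=500,
)
print(x_star)  # approximately [1.0, 0.0, 0.7]
```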
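For the Open Datasets, Dataset Splits, and Experiment Setup rows, the sketch below shows one plausible way to run the first reported experiment (logistic regression on MNIST) in PyTorch. Since the paper releases no code and states no splits or hyperparameters, every concrete choice here is an assumption: the standard torchvision MNIST splits (60,000 train / 10,000 test), batch size 128, learning rate 0.01, five epochs, and the stock Adagrad optimizer standing in for the paper's ADAGRAD+.

```python
# Illustrative reconstruction only: batch size, learning rate, epoch count,
# splits, and optimizer are assumptions, not values reported in the paper.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
test_set = datasets.MNIST("data", train=False, download=True,
                          transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = DataLoader(test_set, batch_size=1000)

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # logistic regression
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01)  # stand-in for ADAGRAD+
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                       # epoch count is illustrative
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last batch loss {loss.item():.4f}")

# Evaluate on the standard 10,000-image test split.
correct = 0
with torch.no_grad():
    for images, labels in test_loader:
        correct += (model(images).argmax(dim=1) == labels).sum().item()
print(f"test accuracy: {correct / len(test_set):.3f}")
```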