Adaptive Accelerated (Extra-)Gradient Methods with Variance Reduction

Authors: Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy Nguyen

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the superior performance of our algorithms compared with previous methods in experiments on real-world datasets.
Researcher Affiliation | Academia | Department of Computer Science, Boston University; Khoury College of Computer and Information Science, Northeastern University.
Pseudocode | Yes | Algorithm 1: AdaVRAE; Algorithm 2: AdaVRAG.
Open Source Code | No | The paper mentions using the codebase of a prior work and links to it (Dubois-Taine et al., 2021), but it does not state that its own code is open source or provide a link to its own implementation.
Open Datasets | Yes | We experiment with binary classification on four standard LIBSVM datasets: a1a, mushrooms, w8a and phishing (Chang & Lin, 2011).
Dataset Splits | No | The paper mentions training objectives and evaluation but does not specify training, validation, or test splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | No | The paper mentions using the "code base of (Dubois-Taine et al., 2021)" but does not list specific software dependencies with version numbers for its own implementation.
Experiment Setup | Yes | For the non-adaptive methods we chose the step size (or equivalently, the inverse of the smoothness parameter 1/β for VRADA) via hyperparameter search over {0.01, 0.05, 0.1, 0.5, 1, 5, 10, 100}. For AdaSVRG, we used η = D/2R as recommended in the original paper. For AdaVRAE and AdaVRAG, we used γ = 0.01 and η = D/2 = R. Table 2 reports the hyperparameter choices used in the experiments.
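For context, the hyperparameter search quoted in the Experiment Setup row is a plain grid search over the listed step sizes on each LIBSVM dataset. The sketch below illustrates that procedure under stated assumptions: the plain-SGD stand-in, the logistic loss, the local dataset filenames, and the epoch count are illustrative choices only, not the authors' implementation (which builds on the codebase of Dubois-Taine et al., 2021).

```python
# Hypothetical sketch of the step-size grid search described above.
# Plain SGD on a logistic loss stands in for the non-adaptive baselines;
# it is NOT the paper's code.
import numpy as np
from sklearn.datasets import load_svmlight_file

STEP_SIZES = [0.01, 0.05, 0.1, 0.5, 1, 5, 10, 100]   # grid listed in the paper
DATASETS = ["a1a", "mushrooms", "w8a", "phishing"]    # LIBSVM datasets used

def logistic_loss(w, X, y):
    """Average logistic loss; labels are assumed to be in {-1, +1}."""
    margins = y * (X @ w)
    return np.mean(np.log1p(np.exp(-margins)))

def run_sgd(X, y, step_size, epochs=5, seed=0):
    """Stand-in training loop: a few epochs of SGD with a fixed step size."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            xi = X[i].toarray().ravel()
            margin = y[i] * (xi @ w)
            grad = -y[i] * xi / (1.0 + np.exp(margin))
            w -= step_size * grad
    return logistic_loss(w, X, y)

best = {}
for name in DATASETS:
    # Assumes the LIBSVM files have been downloaded locally, e.g. "a1a.txt".
    X, y = load_svmlight_file(f"{name}.txt")
    losses = {eta: run_sgd(X, y, eta) for eta in STEP_SIZES}
    best[name] = min(losses, key=losses.get)  # step size with the lowest final loss
print(best)
```

Here the best step size is picked by the final training objective; the paper itself reports the chosen hyperparameter values in its Table 2, and the adaptive methods AdaVRAE and AdaVRAG use the fixed settings quoted above instead of this search.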