Adaptive Accelerated (Extra-)Gradient Methods with Variance Reduction
Authors: Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy Nguyen
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the superior performance of our algorithms compared with previous methods in experiments on real-world datasets. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, Boston University; (2) Khoury College of Computer and Information Science, Northeastern University. |
| Pseudocode | Yes | Algorithm 1 AdaVRAE; Algorithm 2 AdaVRAG |
| Open Source Code | No | The paper mentions building on the code base of a prior work (Dubois-Taine et al., 2021) and links to it, but it does not state that its own implementation is open-source or provide a link to it. |
| Open Datasets | Yes | We experiment with binary classification on four standard LIBSVM datasets: a1a, mushrooms, w8a and phishing (Chang & Lin, 2011). |
| Dataset Splits | No | The paper mentions training objectives and evaluation but does not specify details of training, validation, and test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments. |
| Software Dependencies | No | The paper mentions using the 'code base of (Dubois-Taine et al., 2021)' but does not list specific software dependencies with version numbers for its own implementation. |
| Experiment Setup | Yes | For the non-adaptive methods we chose the step size (or equivalently, the inverse of the smoothness parameter (1/β) for VRADA) via hyperparameter search over {0.01, 0.05, 0.1, 0.5, 1, 5, 10, 100}. For AdaSVRG, we used η = D/2R as recommended in the original paper. For AdaVRAE and AdaVRAG, we used γ = 0.01 and η = D/2 = R. Table 2 reports the hyperparameter choices used in the experiments. (See the sketch below this table.) |
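
The Experiment Setup row fixes the hyperparameter protocol but not the code. Below is a minimal sketch of that protocol, assuming scikit-learn's LIBSVM loader and a placeholder `run_vr_method` standing in for whichever optimizer is being tuned; both are assumptions for illustration, not taken from the paper or its implementation.

```python
from sklearn.datasets import load_svmlight_file

# Step-size grid from the paper (equivalently 1/beta for VRADA).
STEP_SIZES = [0.01, 0.05, 0.1, 0.5, 1, 5, 10, 100]

# The four LIBSVM binary-classification datasets used in the experiments.
DATASETS = ["a1a", "mushrooms", "w8a", "phishing"]


def run_vr_method(X, y, step_size):
    """Hypothetical stand-in for one of the compared variance-reduced
    optimizers; should return the final training loss."""
    raise NotImplementedError


def tune_step_size(libsvm_path):
    # LIBSVM files load as a sparse feature matrix and +/-1 labels.
    X, y = load_svmlight_file(libsvm_path)
    losses = {eta: run_vr_method(X, y, eta) for eta in STEP_SIZES}
    # Keep the step size that reaches the lowest final training loss.
    return min(losses, key=losses.get)
```

The adaptive methods (AdaSVRG, AdaVRAE, AdaVRAG) would bypass this grid and use the fixed settings quoted in the table.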