Online Adaptive Methods, Universality and Acceleration
Authors: Kfir Y. Levy, Alp Yurtsever, Volkan Cevher
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An empirical examination of our method demonstrates its applicability to the above mentioned scenarios and corroborates our theoretical findings. ... In Section 5 we present our empirical study ... Figure 1: Comparison of universal methods at a smooth (top) and a non-smooth (bottom) problem. |
| Researcher Affiliation | Academia | Kfir Y. Levy ETH Zurich yehuda.levy@inf.ethz.ch Alp Yurtsever EPFL alp.yurtsever@epfl.ch Volkan Cevher EPFL volkan.cevher@epfl.ch |
| Pseudocode | Yes | Algorithm 1 Adaptive Gradient Method (AdaGrad) ... Algorithm 2 Accelerated Adaptive Gradient Method (AcceleGrad) (a minimal AdaGrad sketch is given below the table) |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository. |
| Open Datasets | No | We synthetically generate matrix A ∈ ℝ^{n×d} and a point of interest x ∈ ℝ^d randomly, with entries independently drawn from standard Gaussian distribution. ... In the appendix we show results on a real dataset which demonstrate the appeal of AcceleGrad in the large-minibatch regime. (No specific dataset name, link, or citation for public access is provided for the real dataset, and the primary data is synthetic.) |
| Dataset Splits | No | The paper does not explicitly describe training, validation, and test dataset splits with percentages, sample counts, or citations to predefined splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory, or specific machine types). |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, etc.). |
| Experiment Setup | Yes | All methods are initialized at the origin, and we choose K as the ℓ2 norm ball of diameter D. ... The parameter ρ denotes the ratio between D/2 and the distance between the initial point and the solution. Parameter D plays a major role in the step-size of AdaGrad and AcceleGrad. (An illustrative setup sketch is given below the table.) |
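
The pseudocode row quotes Algorithm 1, a projected AdaGrad with a single (scalar) adaptive step-size over a bounded feasible set K. The sketch below is a rough NumPy illustration of that kind of update rule, not a transcription of the paper's pseudocode: the step-size constant, the averaging of iterates, and the function names are assumptions based on standard presentations of scalar-step AdaGrad.

```python
import numpy as np


def project_l2_ball(y, radius):
    """Euclidean projection onto the l2 ball {x : ||x||_2 <= radius}."""
    norm = np.linalg.norm(y)
    return y if norm <= radius else (radius / norm) * y


def adagrad_scalar(grad, x0, radius, num_steps):
    """Projected AdaGrad with a scalar adaptive step-size (sketch).

    The step-size rule sqrt(2) * radius / sqrt(sum of squared gradient
    norms) is an assumed constant; the paper's Algorithm 1 may use a
    different scaling. Returns the averaged iterate, for which universal
    guarantees of this kind are typically stated.
    """
    x = np.array(x0, dtype=float)
    grad_sq_sum = 0.0
    running_sum = x.copy()
    for _ in range(num_steps):
        g = grad(x)
        grad_sq_sum += float(np.dot(g, g))
        # Adaptive step-size based on all gradients observed so far.
        eta = np.sqrt(2.0) * radius / (np.sqrt(grad_sq_sum) + 1e-12)
        x = project_l2_ball(x - eta * g, radius)
        running_sum += x
    return running_sum / (num_steps + 1)
```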
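
The dataset and experiment-setup rows describe a synthetic problem with Gaussian data, initialization at the origin, and K chosen as the ℓ2 norm ball of diameter D, where ρ relates D/2 to the distance from the initial point to the solution. The following sketch assembles those quoted ingredients under stated assumptions: the least-squares objective, the problem sizes, and the value of ρ are illustrative choices, not values reported in the paper. It reuses `adagrad_scalar` and `project_l2_ball` from the sketch above.

```python
import numpy as np


def make_synthetic_least_squares(n, d, seed=0):
    """Synthetic smooth problem f(x) = 0.5 * ||A x - b||^2 with Gaussian data.

    A and the point of interest x_star have i.i.d. standard Gaussian entries,
    as in the quoted setup; the exact objective used in the paper's smooth
    experiment is an assumption here.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, d))
    x_star = rng.standard_normal(d)
    b = A @ x_star

    def f(x):
        return 0.5 * np.linalg.norm(A @ x - b) ** 2

    def grad(x):
        return A.T @ (A @ x - b)

    return f, grad, x_star


# All methods start at the origin; K is the l2 ball of diameter D, and
# rho = (D / 2) / ||x0 - x_star|| measures how much larger K is than the
# distance from the initial point to the solution.
n, d = 200, 50
f, grad, x_star = make_synthetic_least_squares(n, d)
x0 = np.zeros(d)
rho = 2.0                                        # illustrative value, not from the paper
D = 2.0 * rho * np.linalg.norm(x0 - x_star)      # diameter chosen through rho
x_bar = adagrad_scalar(grad, x0, radius=D / 2.0, num_steps=1000)
print(f(x_bar))
```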