Adaptive Averaging in Accelerated Descent Dynamics

Authors: Walid Krichene, Alexandre Bayen, Peter L. Bartlett

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper provides guarantees on adaptive averaging in continuous time, proves that it preserves the quadratic convergence rate of accelerated first-order methods in discrete time, and reports numerical experiments comparing it with existing heuristics such as adaptive restarting. The experiments indicate that adaptive averaging performs at least as well as adaptive restarting, with significant improvements in some cases.
Researcher Affiliation | Collaboration | Walid Krichene (UC Berkeley, walid@eecs.berkeley.edu); Alexandre M. Bayen (UC Berkeley, bayen@berkeley.edu); Peter L. Bartlett (UC Berkeley and QUT, bartlett@cs.berkeley.edu). Walid Krichene is currently affiliated with Google (walidk@google.com).
Pseudocode | Yes | Algorithm 1, "Accelerated mirror descent with adaptive averaging" (an illustrative sketch appears after this table).
Open Source Code | No | The paper does not provide links to, or explicit statements about the availability of, its source code.
Open Datasets | No | The paper describes the objective functions used (e.g., a strongly convex quadratic, a linear function, the Kullback-Leibler divergence) and the feasible sets (e.g., the simplex, the positive orthant). These are mathematical constructs rather than publicly available datasets, and no dataset links or citations are provided (a simplex example appears after this table).
Dataset Splits | No | The paper does not provide dataset split information (e.g., percentages, sample counts, or citations to predefined splits) for training, validation, or testing.
Hardware Specification | No | The paper does not provide hardware details (e.g., CPU/GPU models, memory) for the machines used to run its experiments.
Software Dependencies | No | The paper does not name the ancillary software (e.g., libraries or solvers with version numbers) needed to replicate the experiments.
Experiment Setup | No | While the paper describes the objective functions used and compares different heuristics, it does not report concrete hyperparameter values or detailed training configurations, such as the step sizes used in the discrete algorithm for the reported experiments.
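
Since the Pseudocode row is the only algorithmic artifact the paper exposes and no source code is released, the following is a minimal sketch of the idea behind Algorithm 1. It assumes an unconstrained Euclidean setting (mirror map equal to the identity), and the adaptive test it uses (put weight on the dual iterate z only when z - x is a descent direction for f at x) follows the continuous-time intuition of adaptive averaging described in the paper; the exact weight schedule and fallback rule here are simplifications, not the authors' Algorithm 1.

```python
import numpy as np

def accelerated_descent_adaptive_averaging(grad, x0, step, r=3.0, iters=500):
    """Sketch of accelerated descent with an adaptive-averaging-style
    heuristic (Euclidean simplification, not the paper's exact Algorithm 1)."""
    x = x0.copy()  # primal iterate: a weighted average of the z trajectory
    z = x0.copy()  # dual (mirror-descent) iterate
    for k in range(1, iters + 1):
        lam = r / (r + k)  # standard averaging weight lambda_k = r / (r + k)
        # Adaptive heuristic: move toward z only when z - x is a descent
        # direction at x; otherwise keep the weight on the current average.
        if np.dot(grad(x), z - x) <= 0:
            x_tilde = lam * z + (1.0 - lam) * x
        else:
            x_tilde = x
        g = grad(x_tilde)
        z -= (k * step / r) * g   # dual-averaging step with growing weight k/r
        x = x_tilde - step * g    # primal gradient step
    return x

# Usage on a strongly convex quadratic f(x) = 0.5 * x^T A x (illustrative):
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
A = M.T @ M + np.eye(20)                  # positive definite
grad = lambda x: A @ x
x_hat = accelerated_descent_adaptive_averaging(
    grad, rng.standard_normal(20), step=1.0 / np.linalg.norm(A, 2))
print(float(0.5 * x_hat @ A @ x_hat))     # objective value near 0
```

Taking the averaged candidate unconditionally (always x_tilde = lam * z + (1 - lam) * x) recovers a standard accelerated method; the conditional test is where the adaptive heuristic enters.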
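
The Open Datasets row mentions constrained experiments such as minimizing a Kullback-Leibler divergence over the simplex. As a hedged illustration of that constrained setting, the sketch below runs plain (non-accelerated) mirror descent with the entropy mirror map, i.e., exponentiated gradient; the target distribution and step size are illustrative choices, not the paper's experimental configuration.

```python
import numpy as np

def exponentiated_gradient(grad, x0, step, iters=200):
    """Mirror descent with the entropy mirror map on the probability simplex
    (exponentiated gradient). Illustrative parameters, not the paper's setup."""
    x = x0 / x0.sum()
    for _ in range(iters):
        x = x * np.exp(-step * grad(x))   # multiplicative (mirror) update
        x /= x.sum()                      # re-normalize onto the simplex
    return x

# Example objective: f(x) = KL(x || q) over the simplex, whose gradient is
# log(x / q) + 1 (the constant shift is absorbed by the normalization).
q = np.array([0.5, 0.3, 0.2])
grad_kl = lambda x: np.log(x / q) + 1.0
x = exponentiated_gradient(grad_kl, np.ones(3), step=0.5)
print(x)  # converges to q, the minimizer of KL(. || q)
```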