A simpler approach to accelerated optimization: iterative averaging meets optimism

Authors: Pooria Joulani, Anant Raj, Andras Gyorgy, Csaba Szepesvari

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we show that there is a simpler approach to acceleration: applying optimistic online learning algorithms and querying the gradient oracle at the online average of the intermediate optimization iterates. In particular, we tighten a recent result of Cutkosky (2019) to demonstrate theoretically that online iterate averaging results in a reduced optimization gap, independently of the algorithm involved. We show that carefully combining this technique with existing generic optimistic online learning algorithms yields the optimal accelerated rates for optimizing strongly-convex and non-strongly-convex, possibly composite objectives, with deterministic as well as stochastic first-order oracles. We further extend this idea to variance-reduced optimization. Finally, we also provide universal algorithms that achieve the optimal rate for smooth and non-smooth composite objectives simultaneously without further tuning, generalizing the results of Kavis et al. (2019) and solving a number of their open problems.
Researcher Affiliation | Collaboration | 1) DeepMind, London, UK; 2) Max Planck Institute for Intelligent Systems, Tübingen, Germany (work done during an internship at DeepMind); 3) University of Alberta, Edmonton, AB, Canada.
Pseudocode | Yes | Algorithm 1 (Vanilla Online-to-Batch), Algorithm 2 (Anytime Online-to-Batch; Cutkosky, 2019), and Algorithm 3 (Variance-Reduced Anytime Online-to-Batch with Negative Momentum). Illustrative sketches of the averaging and optimistic updates follow this table.
Open Source Code | No | No explicit statement about providing open-source code or a link to a code repository was found.
Open Datasets | No | The paper is theoretical, focusing on algorithm design and convergence proofs. It does not mention the use of any public or open datasets, nor does it provide access information for any dataset.
Dataset Splits | No | The paper is theoretical and does not conduct experiments; therefore, it does not specify training, validation, or test dataset splits.
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, processors, or cloud resources) used for running experiments were mentioned in the paper.
Software Dependencies | No | The paper mentions no software dependencies or version numbers that would be needed to replicate any computational work.
Experiment Setup | No | The paper is theoretical and does not describe experimental setup details such as hyperparameter values, model initialization, or training configurations.
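
The paper's central recipe, referenced as Algorithm 2 in the pseudocode row above, is easy to sketch: run any online learner, but query the gradient oracle at the running weighted average of its iterates and return that average. The Python sketch below is illustrative only, assuming plain online gradient descent as the base learner, uniform averaging weights, and an unconstrained domain; the function and parameter names (grad, w0, lr, T, weights) are ours, not the paper's.

```python
import numpy as np

def anytime_online_to_batch(grad, w0, lr=0.1, T=1000, weights=None):
    """Illustrative sketch of anytime online-to-batch conversion:
    the gradient oracle is queried at the running weighted average of the
    online learner's iterates, and that average is returned as the answer.
    Plain online gradient descent stands in for the online learner here."""
    if weights is None:
        weights = np.ones(T)                  # uniform averaging weights a_t = 1
    w = np.asarray(w0, dtype=float).copy()    # online learner's iterate w_t
    x = w.copy()                              # running average x_t (the query point)
    weight_sum = 0.0
    for t in range(T):
        weight_sum += weights[t]
        x += (weights[t] / weight_sum) * (w - x)  # x_t = weighted avg of w_1..w_t
        g = grad(x)                           # gradient oracle queried at the average
        w = w - lr * weights[t] * g           # online step on the linear loss a_t * <g_t, .>
    return x                                  # the averaged iterate is the output
```

As a usage example, minimizing f(x) = ||x||^2 / 2 with grad = lambda x: x drives the returned average toward zero; the reduced-optimization-gap guarantee discussed in the abstract concerns this averaged point, not the raw online iterates.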
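The "optimism" half of the title enters by swapping the base learner for an optimistic online method that uses the previous gradient as a hint for the next one. The sketch below is a rough, hedged illustration of that combination, assuming optimistic online gradient steps in the style of Rakhlin and Sridharan, linearly growing weights a_t = t, and a fixed step size; the precise weight and step-size schedules that yield the paper's optimal accelerated and universal rates are derived there and are not reproduced here.

```python
import numpy as np

def optimistic_anytime_sketch(grad, w0, lr=0.01, T=1000):
    """Rough sketch: anytime iterate averaging with an optimistic online learner.
    The learner takes an extra 'optimistic' step using the last observed
    gradient as a prediction of the next one before the average is formed."""
    z = np.asarray(w0, dtype=float).copy()  # learner's base sequence
    x = z.copy()                            # running average / query point
    hint = np.zeros_like(z)                 # prediction of the next gradient
    weight_sum = 0.0
    for t in range(1, T + 1):
        a = float(t)                        # linearly growing weights a_t = t
        w = z - lr * a * hint               # optimistic step using the hint
        weight_sum += a
        x += (a / weight_sum) * (w - x)     # x_t = weighted average of w_1..w_t
        g = grad(x)                         # gradient oracle at the average
        z = z - lr * a * g                  # standard online update with the true gradient
        hint = g                            # next round's hint is the current gradient
    return x
```

Because the weights grow with t, the effective step on round t grows as well, so in any actual run lr must be chosen small relative to T and the objective's smoothness constant; the paper's analysis makes that choice precise.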