Dissipativity Theory for Nesterov’s Accelerated Method

Authors: Bin Hu, Laurent Lessard

ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical In this paper, we adapt the control theoretic concept of dissipativity theory to provide a natural understanding of Nesterov s accelerated method. Our theory ties rigorous convergence rate analysis to the physically intuitive notion of energy dissipation. Moreover, dissipativity allows one to efficiently construct Lyapunov functions (either numerically or analytically) by solving a small semidefinite program. Using novel supply rate functions, we show how to recover known rate bounds for Nesterov s method and we generalize the approach to certify both linear and sublinear rates in a variety of settings. Finally, we link the continuous-time version of dissipativity to recent works on algorithm analysis that use discretizations of ordinary differential equations.
Researcher Affiliation Academia 1University of Wisconsin Madison, Madison, WI 53706, USA. Correspondence to: Bin Hu <bhu38@wisc.edu>.
Pseudocode No The paper describes algorithms (Nesterov's accelerated method, gradient descent) using mathematical equations, but does not include structured pseudocode or an algorithm block.
Open Source Code No The paper does not provide any statements about releasing source code or links to a code repository for the methodology described.
Open Datasets No This paper is theoretical and does not use datasets for training or evaluation.
Dataset Splits No This paper is theoretical and does not involve dataset splits for validation.
Hardware Specification No This paper is theoretical and does not describe experiments, thus no hardware specifications are provided.
Software Dependencies No This paper is theoretical and does not describe experiments, thus no software dependencies with version numbers are provided.
Experiment Setup No This paper is theoretical and does not describe experimental setups or hyperparameters.