Convex and Non-convex Optimization Under Generalized Smoothness

Authors: Haochuan Li, Jian Qian, Yi Tian, Alexander Rakhlin, Ali Jadbabaie

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we further generalize this non-uniform smoothness condition and develop a simple, yet powerful analysis technique that bounds the gradients along the trajectory, thereby leading to stronger results for both convex and non-convex optimization problems. In particular, we obtain the classical convergence rates for (stochastic) gradient descent and Nesterov's accelerated gradient method in the convex and/or non-convex setting under this general smoothness condition.
Researcher Affiliation | Academia | Haochuan Li (MIT, haochuan@mit.edu); Jian Qian* (MIT, jianqian@mit.edu); Yi Tian (MIT, yitian@mit.edu); Alexander Rakhlin (MIT, rakhlin@mit.edu); Ali Jadbabaie (MIT, jadbabai@mit.edu)
Pseudocode | Yes | Algorithm 1: Nesterov's Accelerated Gradient Method (NAG) and Algorithm 2: NAG for µ-strongly-convex functions
Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository for the methodology described.
Open Datasets | No | This is a theoretical paper that does not perform empirical evaluations using datasets. Therefore, it does not discuss dataset availability for training.
Dataset Splits | No | This is a theoretical paper that does not perform empirical evaluations using datasets. Therefore, it does not provide details on training/validation/test splits.
Hardware Specification | No | This is a theoretical paper presenting mathematical analysis and proofs of convergence rates. It does not describe any computational experiments or hardware specifications.
Software Dependencies | No | This is a theoretical paper focusing on mathematical analysis and algorithm design (pseudocode). It does not mention any specific software dependencies or their versions.
Experiment Setup | No | This is a theoretical paper that does not conduct experiments. Therefore, it does not describe experimental setup details such as hyperparameters or training settings.
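For readers unfamiliar with the condition the Research Type excerpt refers to: the "non-uniform smoothness condition" being generalized is the (L0, L1)-smoothness condition of Zhang et al., under which the Hessian norm may grow with the gradient norm. A minimal LaTeX sketch, stated under the simplifying assumption that f is twice differentiable (the paper also uses gradient-based formulations, and its exact definition of the generalized condition should be taken from the paper itself):

    \|\nabla^2 f(x)\| \le L_0 + L_1 \,\|\nabla f(x)\|        % (L_0, L_1)-smoothness of Zhang et al.
    \|\nabla^2 f(x)\| \le \ell\big(\|\nabla f(x)\|\big)      % generalized smoothness: \ell a non-decreasing function

Likewise, Algorithm 1 in the paper is Nesterov's accelerated gradient method. For orientation only, the classical convex-case recursion is sketched below; the step size \eta and the momentum schedule t/(t+3) are standard textbook choices here, not the paper's parameter settings under generalized smoothness:

    x_{t+1} = y_t - \eta\,\nabla f(y_t)
    y_{t+1} = x_{t+1} + \tfrac{t}{t+3}\,(x_{t+1} - x_t)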