Convex and Non-convex Optimization Under Generalized Smoothness
Authors: Haochuan Li, Jian Qian, Yi Tian, Alexander Rakhlin, Ali Jadbabaie
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we further generalize this non-uniform smoothness condition and develop a simple, yet powerful analysis technique that bounds the gradients along the trajectory, thereby leading to stronger results for both convex and non-convex optimization problems. In particular, we obtain the classical convergence rates for (stochastic) gradient descent and Nesterov's accelerated gradient method in the convex and/or non-convex setting under this general smoothness condition. |
| Researcher Affiliation | Academia | Haochuan Li (MIT, haochuan@mit.edu); Jian Qian* (MIT, jianqian@mit.edu); Yi Tian (MIT, yitian@mit.edu); Alexander Rakhlin (MIT, rakhlin@mit.edu); Ali Jadbabaie (MIT, jadbabai@mit.edu) |
| Pseudocode | Yes | Algorithm 1: Nesterov's Accelerated Gradient Method (NAG) and Algorithm 2: NAG for µ-strongly-convex functions (a minimal sketch of the standard NAG update appears after this table) |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository for the methodology described. |
| Open Datasets | No | This is a theoretical paper that does not perform empirical evaluations using datasets. Therefore, it does not discuss dataset availability for training. |
| Dataset Splits | No | This is a theoretical paper that does not perform empirical evaluations using datasets. Therefore, it does not provide details on training/validation/test splits. |
| Hardware Specification | No | This is a theoretical paper presenting mathematical analysis and proofs of convergence rates. It does not describe any computational experiments or hardware specifications. |
| Software Dependencies | No | This is a theoretical paper focusing on mathematical analysis and algorithm design (pseudocode). It does not mention any specific software dependencies or their versions. |
| Experiment Setup | No | This is a theoretical paper that does not conduct experiments. Therefore, it does not describe experimental setup details such as hyperparameters or training settings. |
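The paper's Algorithm 1 is Nesterov's accelerated gradient method analyzed under the generalized smoothness condition. Below is a minimal Python sketch of the standard textbook form of NAG, assuming a user-supplied gradient oracle `grad` and a fixed step size `eta`; the function name and parameters are illustrative, and the paper's actual algorithms choose step sizes from the generalized smoothness constants, which this sketch does not reproduce.

```python
import numpy as np

def nag(grad, x0, eta, T):
    """Minimal sketch of Nesterov's accelerated gradient method (convex case).

    grad: callable returning the gradient at a point (assumed oracle).
    eta:  fixed step size; the paper ties this to generalized smoothness
          constants, which is not reproduced here.
    T:    number of iterations.
    """
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    for t in range(1, T + 1):
        # Gradient step from the extrapolated point y.
        x_next = y - eta * grad(y)
        # Momentum/extrapolation step with the standard (t - 1)/(t + 2) weight.
        y = x_next + (t - 1) / (t + 2) * (x_next - x)
        x = x_next
    return x

# Illustrative usage on the quadratic f(x) = 0.5 * ||x||^2, whose gradient is x.
x_final = nag(lambda x: x, x0=np.ones(5), eta=0.5, T=100)
```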