Time-Reversed Dissipation Induces Duality Between Minimizing Gradient Norm and Function Value

Authors: Jaeyeon Kim, Asuman Ozdaglar, Chanwoo Park, Ernest Ryu

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we present H-duality, which represents a surprising one-to-one correspondence between methods efficiently minimizing function values and methods efficiently minimizing gradient magnitude. In continuous-time formulations, H-duality corresponds to reversing the time dependence of the dissipation/friction term. To the best of our knowledge, H-duality is different from Lagrange/Fenchel duality and is distinct from any previously known duality or symmetry relations. Using H-duality, we obtain a clearer understanding of the symmetry between Nesterov’s method and OGM-G, derive a new class of methods efficiently reducing gradient magnitudes of smooth convex functions, and find a new composite minimization method that is simpler and faster than FISTA-G. (An illustrative numerical sketch of the time-reversed dissipation appears after this table.)
Researcher Affiliation | Academia | Jaeyeon Kim (Seoul National University, kjy011102@snu.ac.kr); Asuman Ozdaglar (MIT EECS, asuman@mit.edu); Chanwoo Park (MIT EECS, cpark97@mit.edu); Ernest K. Ryu (Seoul National University, ernestryu@snu.ac.kr)
Pseudocode | No | The paper presents algorithms like (OGM), (OGM-G), and (SFG) using mathematical equations and variable definitions, but these are integrated into the text and not formatted as distinct pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any statements about releasing open-source code or links to a code repository.
Open Datasets | No | The paper is theoretical and does not describe experiments involving datasets for training.
Dataset Splits | No | The paper is theoretical and does not describe experiments involving dataset splits for validation.
Hardware Specification | No | The paper is theoretical and does not mention any hardware specifications used for experiments.
Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details such as hyperparameters or training configurations.
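
To make the time-reversal statement above concrete, the following is a minimal numerical sketch, not taken from the paper: it integrates a Nesterov-type ODE, X'' + (r/t) X' + ∇f(X) = 0, which is known to rapidly decrease the function value, alongside its friction-reversed counterpart, X'' + (r/(T−t)) X' + ∇f(X) = 0, which the abstract's continuous-time description of H-duality suggests should rapidly decrease the gradient magnitude. The quadratic test function, the horizon T, the choice r = 3, and the semi-implicit Euler integrator are all illustrative assumptions, not choices documented in the paper.

```python
import numpy as np

# Illustrative sketch only (assumed setup, not the paper's experiments):
# integrate X'' + friction(t) X' + grad f(X) = 0 with a semi-implicit
# Euler scheme, treating the friction term implicitly so the blow-up
# of r/(T - t) near t = T stays numerically stable.

A = np.diag([1.0, 10.0, 100.0])   # smooth convex quadratic f(x) = 0.5 x^T A x
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

def integrate(friction, T, h=1e-4):
    x = np.ones(3)                # arbitrary starting point
    v = np.zeros_like(x)          # zero initial velocity
    t = h                         # start just after 0 to avoid r/t at t = 0
    while t < T - h:              # stop just before T to keep r/(T - t) finite
        v = (v - h * grad(x)) / (1.0 + h * friction(t))  # implicit friction step
        x = x + h * v
        t += h
    return x

r, T = 3.0, 10.0
x_nes = integrate(lambda t: r / t, T)         # Nesterov-type flow: friction decays
x_dual = integrate(lambda t: r / (T - t), T)  # H-dual flow: friction reversed in time

for name, x in [("Nesterov-type flow", x_nes), ("H-dual flow", x_dual)]:
    print(f"{name:18s}  f(X(T)) = {f(x):.3e}   "
          f"||grad f(X(T))|| = {np.linalg.norm(grad(x)):.3e}")
```

Running the sketch prints both f(X(T)) and ||∇f(X(T))|| for each flow, so the contrast between the two dissipation schedules can be read off directly; it is a qualitative illustration of the time-reversal idea, not a reproduction of any result in the paper.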