Faster Margin Maximization Rates for Generic Optimization Methods

Authors: Guanghui Wang, Zihao Hu, Vidya Muthukumar, Jacob D. Abernethy

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | To address this limitation, in this paper, we present a series of state-of-the-art implicit bias rates for mirror descent and steepest descent algorithms. Our primary technique involves transforming a generic optimization algorithm into an online learning dynamic that solves a regularized bilinear game, providing a unified framework for analyzing the implicit bias of various optimization methods. The accelerated rates are derived by leveraging the regret bounds of online learning algorithms within this game framework.
Researcher Affiliation | Collaboration | Guanghui Wang1, Zihao Hu1, Vidya Muthukumar2,3, Jacob Abernethy1,4. 1College of Computing, Georgia Institute of Technology; 2School of Electrical and Computer Engineering, Georgia Institute of Technology; 3School of Industrial and Systems Engineering, Georgia Institute of Technology; 4Google Research, Atlanta. {gwang369,zihaohu,vmuthukumar8}@gatech.edu, abernethyj@google.com
Pseudocode | Yes | Algorithm 1: Mirror Descent [Recall ℓ_t(p) = g(p, w_t) and h_t(w) = g(p_t, w)]. An illustrative sketch of this game dynamic follows the table.
Open Source Code | No | The paper does not provide a statement about open-source code availability or a link to a code repository for the methodology described in this paper.
Open Datasets | No | The paper defines a 'Basic setting' with 'a set of n data points S = {(x(i), y(i))}' and makes assumptions about its properties ('linearly separable and bounded'). However, it does not refer to a specific, named publicly available dataset with concrete access information (link, DOI, citation).
Dataset Splits | No | The paper is theoretical and does not describe experimental dataset splits (training, validation, test).
Hardware Specification | No | The paper does not describe any specific hardware used to run experiments.
Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not detail any experimental setup, such as hyperparameters or training settings.
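The Research Type and Pseudocode rows above describe the paper's core idea: cast an optimization method as an online learning dynamic on a bilinear game, where one player observes ℓ_t(p) = g(p, w_t) and the other observes h_t(w) = g(p_t, w). The following is a minimal, hedged sketch of how such a mirror-descent game dynamic might look, not the paper's exact Algorithm 1 or its regularized game. It assumes a plain bilinear payoff g(p, w) = p^T A w with rows A[i] = y_i x_i, an entropic (multiplicative-weights) step for the p-player, and a projected gradient-ascent step for the w-player; the step sizes, horizon, and all variable names are assumptions made for illustration.

```python
import numpy as np

def mirror_descent_game(A, T=1000, eta_p=0.1, eta_w=0.1):
    """Illustrative sketch only: a two-player dynamic on g(p, w) = p^T A w.

    A has shape (n, d) with rows A[i] = y_i * x_i. The p-player minimizes
    the weighted margin over the simplex; the w-player maximizes it over
    the unit Euclidean ball. None of the constants are from the paper.
    """
    n, d = A.shape
    p = np.full(n, 1.0 / n)   # p-player iterate: distribution over examples
    w = np.zeros(d)           # w-player iterate: linear predictor
    w_sum = np.zeros(d)
    for _ in range(T):
        # p-player sees ell_t(p) = g(p, w_t); its gradient in p is A @ w_t.
        # Entropic mirror-descent (multiplicative-weights) step to minimize.
        p = p * np.exp(-eta_p * (A @ w))
        p /= p.sum()
        # w-player sees h_t(w) = g(p_t, w); its gradient in w is A.T @ p_t.
        # Gradient ascent to maximize, then project back to the unit ball.
        w = w + eta_w * (A.T @ p)
        w = w / max(1.0, np.linalg.norm(w))
        w_sum += w
    # The averaged iterate serves as an approximate max-margin direction.
    return w_sum / T

if __name__ == "__main__":
    # Toy linearly separable data: positives near (2, 2), negatives near (-2, -2).
    rng = np.random.default_rng(0)
    pos = rng.normal(size=(50, 2)) + 2.0
    neg = rng.normal(size=(50, 2)) - 2.0
    A = np.vstack([pos, -neg])            # rows are y_i * x_i with y = +1 / -1
    w_hat = mirror_descent_game(A)
    print("smallest margin y_i <x_i, w>:", (A @ w_hat).min())
```

The multiplicative-weights update is one instance of the mirror-descent step the abstract alludes to; the paper's accelerated implicit-bias rates come from pairing such players with online learners that enjoy better regret bounds, which this toy sketch does not attempt to reproduce.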