Tuning-Free Stochastic Optimization
Authors: Ahmed Khaled, Chi Jin
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove that for the task of minimizing a convex and smooth or Lipschitz function over an unbounded domain, tuning-free optimization is impossible. We discuss conditions under which tuning-free optimization is possible even over unbounded domains. In particular, we show that the recently proposed DoG and DoWG algorithms are tuning-free when the noise distribution is sufficiently well-behaved. For the task of finding a stationary point of a smooth and potentially nonconvex function, we give a variant of SGD that matches the best-known high-probability convergence rate for tuned SGD at only an additional polylogarithmic cost. However, we also give an impossibility result that shows no algorithm can hope to match the optimal expected convergence rate for tuned SGD with high probability. |
| Researcher Affiliation | Academia | Ahmed Khaled and Chi Jin, Electrical and Computer Engineering Department, Princeton University, Princeton, NJ, USA. |
| Pseudocode | Yes | Algorithm 1: Restarted SGD; Algorithm 2: T-DoG + Variance Estimation; Algorithm 3: T-DoWG + Variance Estimation; Algorithm 4: FindLeader(S, δ, K) |
| Open Source Code | No | The paper does not include any statement or link indicating that source code for the described methodology is publicly available. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments involving datasets. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments, therefore no dataset splits for validation are mentioned. |
| Hardware Specification | No | The paper is theoretical and does not report experimental results that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not report experimental results that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or training configurations. |
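
For context on the algorithms named in the Pseudocode row: DoG and DoWG use a distance-over-gradients step size that requires no learning-rate tuning. The sketch below is our own minimal illustration of a plain DoG-style SGD loop under the standard DoG update rule; it is not the paper's T-DoG or T-DoWG + variance-estimation algorithms, and all names (`dog_sgd`, `grad_fn`, `r_eps`) are ours.

```python
import numpy as np

def dog_sgd(grad_fn, x0, steps=1000, r_eps=1e-6):
    """Sketch of a DoG-style, learning-rate-free SGD loop.

    grad_fn(x) returns a stochastic gradient at x; r_eps is a small
    initial movement scale that prevents a zero step on the first
    iteration. Illustrative only, not the paper's T-DoG variant.
    """
    x = np.asarray(x0, dtype=float).copy()
    x_init = x.copy()
    max_dist = r_eps       # r_t = max(r_eps, max_{i<=t} ||x_i - x_0||)
    grad_sq_sum = 0.0      # G_t = sum_{i<=t} ||g_i||^2
    for _ in range(steps):
        g = grad_fn(x)
        grad_sq_sum += float(np.dot(g, g))
        eta = max_dist / np.sqrt(grad_sq_sum + 1e-12)  # eta_t = r_t / sqrt(G_t)
        x = x - eta * g
        max_dist = max(max_dist, float(np.linalg.norm(x - x_init)))
    return x

# Toy usage on a noisy quadratic f(x) = ||x||^2 / 2 (hypothetical example):
rng = np.random.default_rng(0)
noisy_grad = lambda x: x + 0.1 * rng.standard_normal(x.shape)
x_final = dog_sgd(noisy_grad, x0=np.ones(10), steps=2000)
```

The step size grows with the largest distance traveled from the initial point and shrinks with the accumulated gradient norms, which is what removes the need to tune a learning rate, up to the small initial scale `r_eps`.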