On the Iteration Complexity of Oblivious First-Order Optimization Algorithms
Authors: Yossi Arjevani, Ohad Shamir
ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We consider a broad class of first-order optimization algorithms which are oblivious, in the sense that their step sizes are scheduled regardless of the function under consideration, except for limited side-information such as smoothness or strong convexity parameters. With the knowledge of these two parameters, we show that any such algorithm attains an iteration complexity lower bound of Ω( p L/ϵ) for L-smooth convex functions, and Ω( p L/µ ln(1/ϵ)) for Lsmooth µ-strongly convex functions. |
| Researcher Affiliation | Academia | Yossi Arjevani YOSSI.ARJEVANI@WEIZMANN.AC.IL Weizmann Institute of Science, Rehovot 7610001, Israel Ohad Shamir OHAD.SHAMIR@WEIZMANN.AC.IL Weizmann Institute of Science, Rehovot 7610001, Israel |
| Pseudocode | Yes | SCHEME 4.2 RESTARTING SCHEME PARAMETERS SMOOTHNESS PARAMETER L > 0 STRONG CONVEXITY PARAMETER µ > 0 CONVERGENCE PARAMETERS α > 0, C > 0 GIVEN A p-CLI OVER L-SMOOTH FUNCTIONS P WITH f(xk) f CL x0 x 2 FOR ANY INITIALIZATION VECTOR x0 ITERATE FOR t = 1, 2, . . . RESTART THE STEP SIZE SCHEDULE OF P INITIALIZE P AT x0 RUN P FOR αp 4CL/µ ITERATIONS SET x0 TO BE THE LAST ITERATE OF THIS EXECUTION END |
| Open Source Code | No | The paper is theoretical and does not mention releasing any source code for its described methodology. |
| Open Datasets | No | The paper is theoretical and does not involve empirical experiments using datasets, thus no information on dataset availability is provided. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments, so it does not discuss dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any computational experiments, and therefore does not specify hardware used. |
| Software Dependencies | No | The paper is theoretical and does not describe any software implementation details or dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any empirical experimental setup, hyperparameters, or training configurations. |