Natasha 2: Faster Non-Convex Optimization Than SGD

Authors: Zeyuan Allen-Zhu

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We design a stochastic algorithm to find ε-approximate local minima of any smooth nonconvex function in rate O(ε^{-3.25}), with only oracle access to stochastic gradients. The best result before this work was O(ε^{-4}) by stochastic gradient descent (SGD). Theorem 1 (informal). Natasha1.5 finds x with ‖∇f(x)‖ ≤ ε in rate T = Õ(1/ε^{10/3}). Theorem 2 (informal). Natasha2 finds x with ‖∇f(x)‖ ≤ ε and ∇²f(x) ⪰ −δI in rate T = Õ(1/δ^5 + 1/(δε^3) + 1/ε^{3.25}). In contrast, the convergence rate of SGD was T = Õ(poly(d)·ε^{-4}) [22]. Table 1: Comparison of online methods for finding ‖∇f(x)‖ ≤ ε. (A worked comparison of the two headline rates is sketched after this table.)
Researcher Affiliation | Industry | Zeyuan Allen-Zhu, Microsoft Research AI, zeyuan@csail.mit.edu
Pseudocode | Yes | Algorithm 1 Natasha1.5(F, x₀, B, T, α). Input: f(x) = (1/n) Σ_{i=1}^n f_i(x), starting vector x₀, epoch length B ∈ [n], epoch count T ≥ 1, learning rate α > 0. (An illustrative, hedged code sketch of this epoch structure follows the table.)
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | The paper is theoretical and does not mention using any datasets for training or evaluation.
Dataset Splits | No | The paper is theoretical and does not involve dataset splits for validation or other purposes.
Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for experiments.
Software Dependencies | No | The paper is theoretical and does not provide specific ancillary software details with version numbers.
Experiment Setup | No | The paper is theoretical and does not contain specific experimental setup details, hyperparameters, or training configurations.
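To make the reported improvement concrete, the short Python sketch below compares the stochastic-gradient query counts implied by the O(ε^{-4}) SGD rate and the O(ε^{-3.25}) Natasha2 rate at a few target accuracies. Constants and polylog factors hidden by the O and Õ notation are dropped, so the numbers are only illustrative, and the function names are ours, not the paper's.

# Illustrative comparison of the leading terms of the two rates quoted above:
# SGD needs on the order of eps^-4 stochastic gradients to reach an
# eps-approximate local minimum, while Natasha2 needs on the order of eps^-3.25.
def sgd_queries(eps: float) -> float:
    return eps ** -4.0

def natasha2_queries(eps: float) -> float:
    return eps ** -3.25

for eps in (1e-1, 1e-2, 1e-3):
    sgd, nat = sgd_queries(eps), natasha2_queries(eps)
    print(f"eps={eps:.0e}: SGD ~ {sgd:.2e}, Natasha2 ~ {nat:.2e}, ratio ~ {sgd / nat:.1f}x")

At ε = 10^{-2}, for example, the leading terms are 10^8 versus roughly 3.2·10^6, about a 32x gap that widens further as ε shrinks.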
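The pseudocode row above records only the Algorithm 1 signature. The Python sketch below is a minimal, hedged illustration of the SVRG-style epoch structure that the signature suggests (a gradient snapshot per epoch of length B, variance-reduced stochastic steps with learning rate α, repeated for T epochs). It is not the paper's exact Natasha1.5 procedure, which additionally splits epochs into sub-epochs and adds a regularization step; all names here (svrg_style_epochs, grad_i) are assumptions made for illustration.

import numpy as np

def svrg_style_epochs(grad_i, n, x0, B, T, alpha, rng=None):
    # grad_i(i, x): stochastic gradient of component f_i at point x.
    # n: number of components in f(x) = (1/n) * sum_i f_i(x).
    # Hedged sketch only: omits Natasha1.5's sub-epochs and regularization.
    rng = rng if rng is not None else np.random.default_rng(0)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(T):                          # T epochs
        snapshot = x.copy()
        full_grad = np.mean([grad_i(i, snapshot) for i in range(n)], axis=0)
        for _ in range(B):                      # B stochastic steps per epoch
            i = int(rng.integers(n))
            # variance-reduced gradient estimator (SVRG-style)
            g = grad_i(i, x) - grad_i(i, snapshot) + full_grad
            x = x - alpha * g
    return x

The design point being illustrated is variance reduction: the estimator grad_i(i, x) - grad_i(i, snapshot) + full_grad is unbiased for ∇f(x), and its variance shrinks as x approaches the snapshot, which is what allows methods of this family to improve on the ε^{-4} rate of plain SGD.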