Variance Reduction for Faster Non-Convex Optimization

Authors: Zeyuan Allen-Zhu, Elad Hazan

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the effectiveness of our methods on empirical risk minimizations with non-convex loss functions and training neural nets." (see also Section 6, Experiments)
Researcher Affiliation | Academia | "Zeyuan Allen-Zhu ZEYUAN@CSAIL.MIT.EDU Princeton University" and "Elad Hazan EHAZAN@CS.PRINCETON.EDU Princeton University"
Pseudocode | Yes | "Algorithm 1 Simplified SVRG method in the non-convex setting"
Open Source Code | No | The paper does not provide an explicit statement or link indicating that source code for the described methodology is openly available.
Open Datasets | Yes | "We consider binary classification on four standard datasets that can be found on the LibSVM website (Fan & Lin): the adult (a9a) dataset... the web (w8a) dataset... the rcv1 (rcv1.binary) dataset... the mnist (class 1) dataset." and "We consider the multi-class (in fact, 10-class) classification problem on CIFAR-10 (60,000 training samples) and MNIST (10,000 training samples), two standard image datasets for neural net studies."
Dataset Splits | Yes | "for each of the 12 datasets, we partition the training samples randomly into a training set of size 4/5 and a validation set of size 1/5."
Hardware Specification | No | The paper mentions "GPU-based running time" but does not specify the hardware (CPU, GPU, or memory models) used for the experiments.
Software Dependencies | No | The paper does not list software dependencies or version numbers for its experimental setup or implementation.
Experiment Setup | Yes | "We choose epoch length m = 2n as suggested by the paper SVRG for ERM experiments, and use the simple Algorithm 1 for both convex and non-convex loss functions." and "We choose a minibatch size of 100 for both these methods."
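The Pseudocode and Experiment Setup rows together describe the paper's simplified SVRG loop: each epoch computes a full gradient at a snapshot point and then takes m = 2n variance-reduced stochastic steps. A minimal sketch of that epoch structure, assuming a user-supplied component-gradient function (the names `svrg` and `grad_fn` are illustrative, not from the paper):

```python
import numpy as np

def svrg(grad_fn, x0, n, eta=0.01, epochs=5, m=None, rng=None):
    """Sketch of a simplified SVRG epoch loop (not the authors' code).

    grad_fn(x, i) returns the gradient of the i-th component function
    f_i at x; the objective is their average (1/n) * sum_i f_i(x).
    """
    rng = np.random.default_rng(rng)
    m = 2 * n if m is None else m   # epoch length m = 2n, as in the setup row
    x = np.array(x0, dtype=float)   # copy so the caller's x0 is untouched
    for _ in range(epochs):
        snapshot = x.copy()
        # full gradient at the snapshot: (1/n) * sum_i grad f_i(snapshot)
        full_grad = sum(grad_fn(snapshot, i) for i in range(n)) / n
        for _ in range(m):
            i = rng.integers(n)
            # variance-reduced stochastic gradient estimate
            g = grad_fn(x, i) - grad_fn(snapshot, i) + full_grad
            x -= eta * g
    return x
```

On a convex instance such as least squares (with `grad_fn(x, i)` returning `2 * A[i] * (A[i] @ x - b[i])`), the iterates converge to the minimizer; in the non-convex regime the same loop is only guaranteed to reach an approximate stationary point.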