Acceleration of SVRG and Katyusha X by Inexact Preconditioning

Authors: Yanli Liu, Fei Feng, Wotao Yin

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our numerical experiments, we observe on average an 8× speedup on the number of iterations and a 7× speedup on runtime. Section 5 (Experiments): To investigate the practical performance of Algorithms 1 and 2, we test on three problems: Lasso, logistic regression, and a synthetic sum-of-nonconvex problem. In the following, we compare SVRG, iPreSVRG, Katyusha X, and iPreKatX on four datasets from LIBSVM: w1a.t (47272 samples, 300 features), protein (17766 samples, 357 features), cod-rna.t (271617 samples, 8 features), australian (690 samples, 14 features), and one synthetic dataset. Our numerical results are presented in the following figures.
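For context on what is being accelerated: all four methods share the SVRG variance-reduced gradient estimator, and the preconditioned variants replace the plain gradient step with an inexact proximal step in the metric of a preconditioner M. Below is a minimal NumPy sketch of one plain SVRG epoch on a least-squares objective; the quadratic objective, step size, and all names are illustrative assumptions, not the paper's Matlab code.

```python
import numpy as np

def svrg_epoch(A, b, x_tilde, eta, m, rng):
    """One SVRG epoch for f(x) = (1/2n) * ||Ax - b||^2 (illustrative).

    x_tilde : snapshot point; its full gradient anchors the
              variance-reduced stochastic gradient estimator.
    """
    n = A.shape[0]
    full_grad = A.T @ (A @ x_tilde - b) / n  # full gradient at the snapshot
    x = x_tilde.copy()
    for _ in range(m):
        i = rng.integers(n)
        # stochastic gradients of the i-th component at x and at the snapshot
        g_i = A[i] * (A[i] @ x - b[i])
        g_i_tilde = A[i] * (A[i] @ x_tilde - b[i])
        # variance-reduced estimator: unbiased, with variance that
        # vanishes as x and x_tilde approach the minimizer
        v = g_i - g_i_tilde + full_grad
        x -= eta * v
    return x
```

Running a few epochs in a loop, e.g. `x = svrg_epoch(A, b, x, 1e-3, 100, np.random.default_rng(0))`, drives x toward the least-squares solution; each epoch costs one full gradient plus m stochastic gradients.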
Researcher Affiliation | Academia | Yanli Liu, Fei Feng, Wotao Yin; Department of Mathematics, University of California, Los Angeles, Los Angeles, CA, USA. Correspondence to: Yanli Liu <yanli@math.ucla.edu>.
Pseudocode | Yes | Algorithm 1: Inexact Preconditioned SVRG (iPreSVRG); Algorithm 2: Inexact Preconditioned Katyusha X (iPreKatX); Procedure 1: Procedure for solving (3.2) inexactly; Algorithm 3: FISTA with restart for solving (3.2).
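Since the subproblem (3.2) is solved only inexactly, Algorithm 3 runs a few FISTA iterations with restart. The following is a generic sketch of FISTA with gradient-based adaptive restart for min_x f(x) + g(x) with f smooth, in the spirit of Algorithm 3 but not reproducing its exact procedure or stopping rule; grad_f, prox_g, and the step 1/L are assumed inputs.

```python
import numpy as np

def fista_restart(grad_f, prox_g, x0, L, n_iters):
    """FISTA with adaptive restart for min_x f(x) + g(x), f smooth (sketch).

    grad_f : gradient of the smooth part f
    prox_g : prox of g, called as prox_g(v, 1/L)
    L      : Lipschitz constant of grad_f (step size 1/L)
    """
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iters):
        x_new = prox_g(y - grad_f(y) / L, 1.0 / L)
        # gradient-based restart test: drop the momentum
        # whenever it no longer points in a descent direction
        if np.dot(y - x_new, x_new - x) > 0:
            t, y = 1.0, x_new.copy()
        else:
            t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
            y = x_new + ((t - 1.0) / t_new) * (x_new - x)
            t = t_new
        x = x_new
    return x
```

The restart test is the standard heuristic of O'Donoghue and Candès; it works well on strongly convex subproblems such as those produced by adding a proximal term in the M-metric.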
Open Source Code | Yes | The code is available at: https://github.com/uclaopt/IPSVRG.
Open Datasets | Yes | We compare SVRG, iPreSVRG, Katyusha X, and iPreKatX on four datasets from LIBSVM (https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/): w1a.t (47272 samples, 300 features), protein (17766 samples, 357 features), cod-rna.t (271617 samples, 8 features), australian (690 samples, 14 features), and one synthetic dataset.
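These files use the plain sparse-text LIBSVM format and can be parsed with standard tooling, for example scikit-learn's load_svmlight_file; the local path below is an assumption, with the files downloaded from the LIBSVM page above.

```python
from sklearn.datasets import load_svmlight_file

# parse a LIBSVM-format file into a sparse matrix X and a label vector y
X, y = load_svmlight_file("w1a.t")  # path is illustrative
print(X.shape, y.shape)             # w1a.t should give (47272, 300) and (47272,)
```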
Dataset Splits | No | The paper discusses epoch length and general experimental settings but does not explicitly mention train/validation/test splits, specific sample counts for splits, or cross-validation methodology.
Hardware Specification | Yes | The experiments are conducted on a Windows system with an Intel Core i7 2.6 GHz CPU.
Software Dependencies | Yes | All algorithms are implemented in Matlab R2015b.
Experiment Setup | Yes |
1. We choose the epoch length m = 100 in all experiments, since we found that the choices m ∈ {n/2, n} need more gradient evaluations.
2. For iPreSVRG and iPreKatX, we use FISTA as the subproblem iterator S. If the preconditioner M is diagonal, the number of inner iterations for solving the subproblem is p = 1; otherwise, we set p = 20.
3. In all experiments, we tune the step size η and the momentum weight τ to their optimal values.
4. All algorithms are initialized at x0 = 0.
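A compact restatement of these settings as a hypothetical Python configuration; all key names are illustrative, and η and τ carry no fixed values here because item 3 tunes them per problem.

```python
import numpy as np

config = {
    "epoch_length_m": 100,         # item 1: m = 100 in all experiments
    "subproblem_solver": "FISTA",  # item 2: inner iterator S
    "inner_iters_p": {             # item 2: p depends on the preconditioner M
        "diagonal_M": 1,
        "non_diagonal_M": 20,
    },
    "step_size_eta": None,         # item 3: tuned to its optimum per problem
    "momentum_tau": None,          # item 3: tuned to its optimum per problem
}

d = 300            # feature dimension (300 matches w1a.t; illustrative)
x0 = np.zeros(d)   # item 4: every algorithm starts from x0 = 0
```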