Local Convergence Properties of SAGA/Prox-SVRG and Acceleration
Authors: Clarice Poon, Jingwei Liang, Carola Schoenlieb
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Several concrete examples arising from machine learning are considered to demonstrate the obtained result." and, from Section 6 (Numerical Experiments): "We now consider several examples to verify the established results." |
| Researcher Affiliation | Academia | 1DAMTP, University of Cambridge, Cambridge, United Kingdom. |
| Pseudocode | Yes | SAGA Algorithm (Defazio et al., 2014): The key idea of SAGA for reducing the variance is utilising the gradient history for the evaluation of the current gradient. Given an initial point $x_0$, define the individual gradients $g_{0,i} := \nabla f_i(x_0)$, $i = 1, \dots, m$. Then, for $k = 0, 1, 2, \dots$: sample $i_k$ uniformly from $\{1, \dots, m\}$; $w_k = x_k - \gamma_k \big( \nabla f_{i_k}(x_k) - g_{k,i_k} + \tfrac{1}{m}\sum_{i=1}^m g_{k,i} \big)$; $x_{k+1} = \mathrm{prox}_{\gamma_k R}(w_k)$; $g_{k+1,i} = \nabla f_i(x_k)$ if $i = i_k$, and $g_{k+1,i} = g_{k,i}$ otherwise (Eq. 6). Prox-SVRG Algorithm (Xiao & Zhang, 2014): Compared to SAGA, instead of using the gradient history, Prox-SVRG computes the full gradient at a given point and uses it for a certain number of iterations. Let $P$ be a positive integer; for $\ell = 0, 1, 2, \dots$: set $g_\ell = \tfrac{1}{m}\sum_{i=1}^m \nabla f_i(\tilde{x}_\ell)$ and $x_{\ell,0} = \tilde{x}_\ell$; for $p = 1, \dots, P$: sample $i_p$ uniformly from $\{1, \dots, m\}$; $w_k = x_{\ell,p-1} - \gamma_k \big( \nabla f_{i_p}(x_{\ell,p-1}) - \nabla f_{i_p}(\tilde{x}_\ell) + g_\ell \big)$; $x_{\ell,p} = \mathrm{prox}_{\gamma_k R}(w_k)$. Option I: $\tilde{x}_{\ell+1} = x_{\ell,P}$; Option II: $\tilde{x}_{\ell+1} = \tfrac{1}{P}\sum_{p=1}^P x_{\ell,p}$ (Eq. 8). |
| Open Source Code | Yes | "Lastly, for the numerical experiments considered in this paper, the corresponding MATLAB source code to reproduce the results is available online." (https://github.com/jliang993/Local-VRSGD) |
| Open Datasets | No | The paper uses synthetic data for its main experiments, specifically stating in Example 6.1: "Let $m > 0$ and $(z_i, y_i) \in \mathbb{R}^n \times \{\pm 1\}$, $i = 1, \dots, m$ be the training set." No explicit access information (link, DOI, citation) for a publicly available dataset is provided for the data used in the main paper's experiments. It mentions "experiments on large scale real data are presented" in supplementary material, but no details are given in the main paper. |
| Dataset Splits | No | The paper describes synthetic data generation for training but does not specify explicit training/validation/test dataset splits, percentages, or cross-validation setup. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'MATLAB source code' but does not specify versions for MATLAB or any other software dependencies, libraries, or solvers used in the experiments. |
| Experiment Setup | Yes | The setting of the experiment is: $n = 256$, $m = 128$, $\mu = 1/ m$ and $L = 1188$. The parameter choices of SAGA and Prox-SVRG are: SAGA: $\gamma = 1/(2L)$; Prox-SVRG: $\gamma = 1/(3L)$, $P = m$, $\mu = 1/m$. And for Example 6.3: $\ell_{1,2}$-norm: $(m, n) = (256, 512)$, $x_{ob}$ has 8 non-zero blocks of block-size 4; nuclear norm: $(m, n) = (2048, 4096)$, $\mathrm{rank}(x_{ob}) = 4$. |
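The SAGA update quoted in the pseudocode cell above can be sketched in a few lines of NumPy. This is a minimal illustration of Eq. (6), not the paper's MATLAB implementation: the soft-thresholding prox (an $\ell_1$ regulariser) and the quadratic test problem in the usage note are assumed example choices, while the paper's $R$ is a general nonsmooth regulariser.

```python
import numpy as np

def prox_l1(v, t):
    # Proximal operator of t*||.||_1 (soft-thresholding); one example
    # choice of R, not the paper's general regulariser.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def saga(grad_fi, m, x0, gamma, n_iter, prox=prox_l1, lam=0.1, seed=0):
    """Minimal SAGA sketch following Eq. (6).

    grad_fi(i, x) returns the gradient of the i-th component f_i at x.
    The table g keeps one past gradient per component (the "gradient
    history" that SAGA uses for variance reduction).
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    g = np.array([grad_fi(i, x0) for i in range(m)])  # g_{0,i} = grad f_i(x0)
    g_mean = g.mean(axis=0)                           # running average of the table
    for _ in range(n_iter):
        ik = rng.integers(m)                          # sample i_k uniformly
        gik_new = grad_fi(ik, x)
        # Variance-reduced gradient step: new component gradient minus its
        # stored copy, plus the average of the stored table.
        w = x - gamma * (gik_new - g[ik] + g_mean)
        x = prox(w, gamma * lam)                      # proximal step on R
        g_mean += (gik_new - g[ik]) / m               # update average incrementally
        g[ik] = gik_new                               # update gradient table
    return x
```

As a sanity check, on the toy strongly convex problem $\min_x \tfrac{1}{m}\sum_i \tfrac{1}{2}(x - c_i)^2$ (with `lam=0.0`, so the prox is the identity) the iterates approach the mean of the $c_i$.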