Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization
Authors: Ali Kavis, Stratis Skoulakis, Kimon Antonakopoulos, Leello Tadesse Dadi, Volkan Cevher
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We complement our theoretical findings with an evaluation of the numerical performance of the algorithm under different experimental setups. We aim to highlight the sample complexity improvements over simple stochastic methods, while displaying the advantages of adaptive step-size strategies. For that purpose we design two setups: first, we consider the minimization of a convex loss with a non-convex regularizer in the sense of Wang et al. [51]; second, we consider an image classification task with neural networks. |
| Researcher Affiliation | Academia | Ali Kavis LIONS, EPFL ali.kavis@epfl.ch; Stratis Skoulakis LIONS, EPFL efstratios.skoulakis@epfl.ch; Kimon Antonakopoulos LIONS, EPFL kimon.antonakopoulos@epfl.ch; Leello Tadesse Dadi LIONS, EPFL leello.dadi@epfl.ch; Volkan Cevher LIONS, EPFL volkan.cevher@epfl.ch |
| Pseudocode | Yes | Algorithm 1 Adaptive SPIDER (ADASPIDER). Input: x_0 ∈ R^d, β_0 > 0, G_0 > 0. 1: G ← 0; 2: for t = 0, ..., T−1 do; 3: if t mod n = 0 then; 4: ∇_t ← ∇f(x_t); 5: else; 6: pick i_t ∈ {1, ..., n} uniformly at random; 7: ∇_t ← ∇f_{i_t}(x_t) − ∇f_{i_t}(x_{t−1}) + ∇_{t−1}; 8: end if; 9: γ_t ← 1 / (n^{1/4} β_0 √(n^{1/2} G_0^2 + Σ_{s=0}^{t} ‖∇_s‖^2)); 10: x_{t+1} ← x_t − γ_t ∇_t; 11: end for; 12: return an iterate drawn uniformly at random from {x_0, ..., x_{T−1}}. (See the Python sketch after the table.) |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] |
| Open Datasets | Yes | We picked two datasets from LibSVM, namely a1a, mushrooms. ... we test our algorithm on two benchmark datasets: MNIST [30] and Fashion-MNIST [54]. |
| Dataset Splits | No | The paper states the use of datasets like MNIST and Fashion MNIST and reports 'Test Accuracy' in Table 2. However, it does not provide explicit details about the split percentages or methodology used for training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud computing specifications. |
| Software Dependencies | No | The paper refers to various algorithms and frameworks such as LibSVM, PyTorch (implied by the neural-network training), KatyushaXw, and SVRG, but it does not specify any software dependencies with their version numbers. |
| Experiment Setup | Yes | Table 2 explicitly lists 'Algorithm Parameters' including 'Batch Size = 32, c_init = 0.03' for MNIST and 'Batch Size = 128, c_init = 0.01' for Fashion-MNIST. It also specifies parameters for other algorithms, such as 'AdaGrad [16] η = 0.01, ε = 10^−4' and 'SGD [46] η = 0.01'. |
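For concreteness, the pseudocode quoted in the table above can be read as the following minimal NumPy sketch of ADASPIDER under the stated assumptions: `grad_full`, `grad_i`, and the default hyperparameter values are illustrative placeholders, not part of the paper.

```python
import numpy as np

def adaspider(grad_full, grad_i, x0, n, T, beta0=1.0, G0=1.0, rng=None):
    """Sketch of ADASPIDER (Algorithm 1) for a finite sum f = (1/n) * sum_i f_i.

    grad_full(x): full gradient (1/n) * sum_i grad f_i(x).
    grad_i(i, x): gradient of the i-th component f_i at x.
    beta0, G0 follow the pseudocode; the callables and NumPy setting are
    illustrative assumptions.
    """
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    x_prev = x.copy()
    v = np.zeros_like(x)        # SPIDER gradient estimate, nabla_t
    sq_norm_sum = 0.0           # running sum of ||nabla_s||^2
    iterates = []

    for t in range(T):
        if t % n == 0:
            v = grad_full(x)                           # full-gradient restart
        else:
            i = rng.integers(n)                        # uniform component index
            v = grad_i(i, x) - grad_i(i, x_prev) + v   # recursive SPIDER update
        sq_norm_sum += float(np.dot(v, v))
        # Adaptive step size: needs no knowledge of smoothness or variance.
        gamma = 1.0 / (n**0.25 * beta0 * np.sqrt(np.sqrt(n) * G0**2 + sq_norm_sum))
        iterates.append(x.copy())
        x_prev, x = x, x - gamma * v

    # Return an iterate drawn uniformly at random from {x_0, ..., x_{T-1}}.
    return iterates[rng.integers(T)]
```

The point of the step-size rule is that γ_t shrinks with the accumulated squared norms of the SPIDER estimates, which is what makes the method parameter-free in the sense discussed in the paper; the periodic full-gradient pass every n iterations is the standard SPIDER checkpoint.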