Stochastic Variance Reduction Methods for Saddle-Point Problems

Authors: Balamurugan Palaniappan, Francis Bach

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 7 (Experiments): "We consider binary classification problems with design matrix K and label vector in {-1, 1}^n, a non-separable strongly-convex regularizer with an efficient proximal operator... We consider two datasets, sido (n = 10142, d = 4932, non-separable losses and regularizers presented above) and rcv1 (n = 20242, d = 47236, separable losses and regularizer described in Appendix F, so that we can compare with SAGA run in the primal). We report below the squared distance to optimizers which appears in our bounds, as a function of the number of passes on the data (for more details and experiments with primal-dual gaps and testing losses, see Appendix F)." (The saddle-point formulation behind this setup is sketched after the table.)
Researcher Affiliation | Academia | P. Balamurugan, INRIA and École Normale Supérieure, Paris (balamurugan.palaniappan@inria.fr); Francis Bach, INRIA and École Normale Supérieure, Paris (francis.bach@ens.fr)
Pseudocode | Yes | Algorithm 1, "SVRG: Stochastic Variance Reduction for Saddle Points". (A hedged code sketch of this style of update follows the table.)
Open Source Code | No | The paper neither states that source code for the described methodology is released nor links to a code repository.
Open Datasets | Yes | "We consider two datasets, sido (n = 10142, d = 4932, non-separable losses and regularizers presented above) and rcv1 (n = 20242, d = 47236, separable losses and regularizer described in Appendix F, so that we can compare with SAGA run in the primal)."
Dataset Splits | No | The paper refers to "testing losses" but gives no train/validation/test split percentages, sample counts, or data-partitioning methodology in the main text.
Hardware Specification | No | The paper does not report hardware details such as GPU/CPU models, processor types, or memory amounts used for its experiments.
Software Dependencies | No | The paper does not list ancillary software dependencies such as library or solver names with version numbers.
Experiment Setup | No | The paper mentions general settings such as non-uniform sampling and the regularization strength ratio λ/λ0, but omits concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) and detailed training configurations in the main text.
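
For context on the "Research Type" quote above, the paper's machine-learning experiments instantiate convex-concave saddle-point problems. The generic form below is a hedged reconstruction consistent with the quoted setup (K the n x d design matrix, Ω a strongly-convex regularizer, φ_i* the Fenchel conjugate of the loss on example i); the exact constants and notation are an assumption, not a verbatim formula from the paper:

```latex
% Hedged sketch: saddle-point reformulation of regularized ERM,
% assuming the paper's notation (K design matrix, Omega regularizer,
% phi_i^* conjugate losses).
\min_{x \in \mathbb{R}^d} \; \max_{y \in \mathbb{R}^n} \;
  \lambda\,\Omega(x) \;+\; \frac{1}{n}\, y^\top K x
  \;-\; \frac{1}{n} \sum_{i=1}^{n} \varphi_i^*(y_i)
```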
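To make the "Pseudocode: Yes" entry concrete, here is a minimal NumPy sketch of an SVRG-style primal-dual iteration in the spirit of Algorithm 1: a full bilinear field is computed at a snapshot, inner iterations use a variance-reduced estimate from a sampled example, and the strongly convex/concave terms are handled by an exact resolvent (prox) step. The ridge-regression saddle objective, the name svrg_saddle, the one-pass inner-loop length, and the step-size/epoch defaults are all illustrative assumptions, not the authors' exact algorithm or tuning.

```python
import numpy as np

def svrg_saddle(K, b, lam, sigma=0.1, epochs=30, seed=0):
    """SVRG-style primal-dual iterations (hedged sketch) for the assumed
    saddle problem
        min_x max_y (lam/2)||x||^2 + (1/n) y^T K x - (1/n)(||y||^2/2 + b^T y),
    whose x-solution is the ridge-regression estimator."""
    rng = np.random.default_rng(seed)
    n, d = K.shape
    x, y = np.zeros(d), np.zeros(n)
    for _ in range(epochs):
        # Snapshot (x~, y~) and full bilinear field
        # T(x~, y~) = ((1/n) K^T y~, -(1/n) K x~).
        xs, ys = x.copy(), y.copy()
        Tx_full = K.T @ ys / n
        Ty_full = -(K @ xs) / n
        for _ in range(n):  # one inner pass per snapshot (an assumption)
            i = rng.integers(n)
            # Variance-reduced estimate T(z~) + T_i(z) - T_i(z~), where
            # T_i(x, y) = (K_i * y_i, -(K_i . x) e_i) is unbiased for T
            # under uniform sampling of i.
            gx = Tx_full + K[i] * (y[i] - ys[i])
            gy = Ty_full.copy()
            gy[i] += -(K[i] @ x) + (K[i] @ xs)
            # Forward step on the sampled part, then the exact resolvent
            # of the strongly convex/concave terms (prox step).
            x = (x - sigma * gx) / (1.0 + sigma * lam)
            y = (y - sigma * gy - sigma * b / n) / (1.0 + sigma / n)
    return x, y
```

At the fixed point this sketch satisfies y* = K x* - b and (K^T K / n + lam I) x* = K^T b / n, i.e. x* is the ridge solution, which is a quick way to sanity-check an implementation on synthetic data.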