Without-Replacement Sampling for Stochastic Gradient Methods

Authors: Ohad Shamir

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we provide competitive convergence guarantees for without-replacement sampling under several scenarios, focusing on the natural regime of few passes over the data. Moreover, we describe a useful application of these results in the context of distributed optimization with randomly-partitioned data, yielding a nearly-optimal algorithm for regularized least squares (in terms of both communication complexity and runtime complexity) under broad parameter regimes. Our proof techniques combine ideas from stochastic optimization, adversarial online learning and transductive learning theory, and can potentially be applied to other stochastic optimization and learning problems.
Researcher Affiliation | Academia | Ohad Shamir, Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel; ohad.shamir@weizmann.ac.il
Pseudocode | Yes | Algorithm 1: SVRG using Without-Replacement Sampling (an illustrative sketch of this procedure appears after the table below).
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide links to a code repository.
Open Datasets | No | The paper focuses on theoretical analysis of optimization algorithms and does not describe or use specific public datasets for empirical training or evaluation.
Dataset Splits | No | The paper is theoretical and does not conduct experiments with datasets, so it does not provide details on training, validation, or test splits.
Hardware Specification | No | The paper is theoretical and does not report on empirical experiments, so no specific hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not describe any specific software dependencies with version numbers required for reproducibility.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameter values or training configurations for empirical evaluation.
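Since the paper provides pseudocode (Algorithm 1) but no released implementation, the following is a minimal illustrative sketch of SVRG with without-replacement (shuffled) sampling, applied here to a regularized least-squares objective. It is not the authors' code: the objective, step size, epoch count, and function names are assumptions made purely for illustration.

```python
import numpy as np

def svrg_without_replacement(X, y, lam=0.1, eta=0.01, epochs=10, seed=0):
    """Illustrative SVRG variant whose inner loop visits the data in a
    freshly shuffled order (without-replacement sampling) each epoch.

    Applied to regularized least squares:
        f_i(w) = 0.5 * (x_i^T w - y_i)^2 + (lam / 2) * ||w||^2
    The objective, step size, and epoch count are illustrative choices,
    not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)

    def grad_i(w, i):
        # Gradient of the i-th regularized least-squares loss term.
        return (X[i] @ w - y[i]) * X[i] + lam * w

    for _ in range(epochs):
        w_snap = w.copy()
        # Full gradient at the snapshot point (the SVRG control variate).
        full_grad = (X.T @ (X @ w_snap - y)) / n + lam * w_snap

        # Without-replacement pass: iterate over a random permutation of the
        # indices, so every example is used exactly once per epoch.
        for i in rng.permutation(n):
            # Variance-reduced stochastic gradient step.
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w = w - eta * g
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 10))
    w_true = rng.standard_normal(10)
    y = X @ w_true + 0.01 * rng.standard_normal(200)
    w_hat = svrg_without_replacement(X, y)
    print("recovery error:", np.linalg.norm(w_hat - w_true))
```

The only change relative to standard SVRG is that each inner epoch iterates over a fresh random permutation of the indices instead of drawing indices i.i.d. with replacement; drawing a new permutation every epoch is one natural instantiation of the without-replacement regime the paper analyzes.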