Random Reshuffling is Not Always Better

Authors: Christopher M. De Sa

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Running Algorithm 1 on this example produces the results shown in Figure 1. This shows empirically that, counter-intuitively, standard with-replacement random sampling can outperform random reshuffling for this algorithm. We explore this task empirically in Figure 3, where we ran a thousand epochs of SGD using both with- and without-replacement sampling on the example task we constructed in this section.
Researcher Affiliation | Academia | Christopher De Sa, Department of Computer Science, Cornell University, cdesa@cs.cornell.edu
Pseudocode | Yes | Algorithm 1: Parallel SGD
Open Source Code | No | The paper does not provide a specific repository link or an explicit statement about the release of the source code for the methodology described in the paper.
Open Datasets | No | The paper constructs a specific 'matrix-completion-like task' and defines parameters (u_i, v_i, a_i) for it, stating 'we pick n = 40, a constant step size α = 0.1, γ = 0.05, M = 1000, K = 100, u_i = 1, v_i = y_i (the y_i of Section 2), and a_i = (1 − αγ)/(2α)'. This is a custom-generated dataset for the experiment, and no public access information is provided.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | Concretely, we pick n = 40, a constant step size α = 0.1, γ = 0.05, M = 1000, K = 100, u_i = 1, v_i = y_i (the y_i of Section 2), and a_i = (1 − αγ)/(2α); we initialize w_0 randomly such that ‖w_0‖ = 1.
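
For intuition about the two sampling schemes compared in the assessment above, the sketch below runs a plain sequential SGD loop with either i.i.d. with-replacement sampling or per-epoch random reshuffling. Only n = 40, the step size α = 0.1, and the thousand epochs are taken from the stated setup; the per-example quadratic loss, the targets y, and the RNG seed are hypothetical stand-ins, and the paper's actual matrix-completion-like construction (the u_i, v_i, a_i above and the parallel structure of Algorithm 1) is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)      # hypothetical seed, for repeatability only
n, alpha, epochs = 40, 0.1, 1000    # n, step size, and epoch count from the stated setup
y = rng.standard_normal(n)          # stand-in per-example targets (not the paper's y_i)

def grad(w, i):
    # Gradient of an illustrative component loss f_i(w) = (w - y_i)^2 / 2.
    return w - y[i]

def run_sgd(sampling, w0=1.0):
    w = w0
    for _ in range(epochs):
        if sampling == "with_replacement":
            order = rng.integers(0, n, size=n)   # n i.i.d. uniform draws per epoch
        else:
            order = rng.permutation(n)           # random reshuffling: each example once per epoch
        for i in order:
            w -= alpha * grad(w, i)
    return w

for scheme in ("with_replacement", "reshuffling"):
    print(scheme, run_sgd(scheme))
```

On this toy objective the two schemes behave almost identically; the paper's contribution is a carefully constructed example (and the accompanying analysis) where with-replacement sampling outperforms random reshuffling.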