Random Reshuffling is Not Always Better
Authors: Christopher M. De Sa
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Running Algorithm 1 on this example produces the results shown in Figure 1. This shows empirically that, counter-intuitively, standard with-replacement random sampling can outperform random reshuffling for this algorithm. We explore this task empirically in Figure 3, where we ran a thousand epochs of SGD using both with- and without-replacement sampling on the example task we constructed in this section. (A generic sketch of the two sampling schemes appears below the table.) |
| Researcher Affiliation | Academia | Christopher De Sa, Department of Computer Science, Cornell University, cdesa@cs.cornell.edu |
| Pseudocode | Yes | Algorithm 1 Parallel SGD |
| Open Source Code | No | The paper does not provide a repository link or an explicit statement about releasing source code for the described methodology. |
| Open Datasets | No | The paper constructs a specific 'matrix-completion-like task' and defines parameters (u_i, v_i, a_i) for it, stating 'we pick n = 40, a constant step size α = 0.1, γ = 0.05, M = 1000, K = 100, u_i = 1, v_i = y_i (the y_i of Section 2), and a_i = (1 − αγ)/(2α)'. This is a custom-generated dataset for the experiment, and no public access information is provided. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | Concretely, we pick n = 40, a constant step size α = 0.1, γ = 0.05, M = 1000, K = 100, u_i = 1, v_i = y_i (the y_i of Section 2), and a_i = (1 − αγ)/(2α); we initialize w_0 randomly such that ∥w_0∥ = 1. (A hedged configuration sketch of these settings appears below the table.) |
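
The comparison described in the Research Type row, with-replacement sampling versus random reshuffling in SGD, can be illustrated with a minimal sketch. The least-squares objective, step size, and dimensions below are placeholder assumptions chosen for illustration; this is plain sequential SGD, not the paper's Algorithm 1 (Parallel SGD) or its constructed matrix-completion-like task.

```python
# Minimal sketch: SGD with two sampling schemes.
# NOT the paper's experiment; a generic least-squares stand-in.
import numpy as np

def sgd(w0, grad, n, epochs, alpha, rng, reshuffle):
    """Run SGD over n component gradients with either sampling scheme."""
    w = w0.copy()
    for _ in range(epochs):
        if reshuffle:
            order = rng.permutation(n)          # without replacement (random reshuffling)
        else:
            order = rng.integers(0, n, size=n)  # with replacement
        for i in order:
            w -= alpha * grad(w, i)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 40, 5                        # n matches the paper's n = 40; d is an assumption
    A = rng.normal(size=(n, d))
    b = rng.normal(size=n)
    w_star = np.linalg.lstsq(A, b, rcond=None)[0]

    # Component gradient of f_i(w) = 0.5 * (A[i]^T w - b[i])^2.
    def grad(w, i):
        return (A[i] @ w - b[i]) * A[i]

    w0 = rng.normal(size=d)
    w0 /= np.linalg.norm(w0)            # mirror the unit-norm initialization
    for reshuffle in (False, True):
        w = sgd(w0, grad, n, epochs=1000, alpha=0.01,
                rng=np.random.default_rng(1), reshuffle=reshuffle)
        label = "random reshuffling" if reshuffle else "with-replacement  "
        print(label, "distance to optimum:", np.linalg.norm(w - w_star))
```

Toggling `reshuffle` switches between drawing a fresh index uniformly at each step and drawing one random permutation per epoch, which is exactly the distinction the paper's experiment probes.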
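
The Experiment Setup row lists concrete constants; the sketch below merely collects them into runnable form under stated assumptions. The y_i values from Section 2 of the paper, the roles of M and K, and the dimension of w_0 are not reproduced in this report, so they appear only as labeled placeholders.

```python
# Minimal sketch of the quoted experiment constants; not the paper's objective.
import numpy as np

rng = np.random.default_rng(0)

n     = 40      # number of component functions
alpha = 0.1     # constant step size
gamma = 0.05
M     = 1000    # quoted constant; role defined in the paper
K     = 100     # quoted constant; role defined in the paper

u = np.ones(n)                                      # u_i = 1
y = np.zeros(n)                                     # placeholder for the paper's y_i (Section 2)
v = y.copy()                                        # v_i = y_i
a = np.full(n, (1 - alpha * gamma) / (2 * alpha))   # a_i = (1 - αγ) / (2α)

w0 = rng.normal(size=n)                             # dimension assumed; not stated in this report
w0 /= np.linalg.norm(w0)                            # random w_0 with ||w_0|| = 1
```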