GraB: Finding Provably Better Data Permutations than Random Reshuffling
Authors: Yucheng Lu, Wentao Guo, Christopher M. De Sa
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically on applications including MNIST, CIFAR10, WikiText and GLUE that GraB can outperform random reshuffling in terms of both training and validation performance, and even outperform state-of-the-art greedy ordering while reducing memory usage over 100×. |
| Researcher Affiliation | Academia | Yucheng Lu, Wentao Guo, Christopher De Sa Department of Computer Science Cornell University {yl2967, wg247, cmd353}@cornell.edu |
| Pseudocode | Yes | Algorithm 1 Herding with Greedy Ordering (see the sketch after this table) |
| Open Source Code | Yes | The experimental code is available at https://github.com/EugeneLYC/GraB. |
| Open Datasets | Yes | We show empirically on applications including MNIST, CIFAR10, WikiText and GLUE that GraB can outperform random reshuffling in terms of both training and validation performance, and even outperform state-of-the-art greedy ordering while reducing memory usage over 100×. |
| Dataset Splits | No | The paper mentions using training and validation data, but does not specify explicit dataset split percentages, sample counts, or refer to predefined splits in the main text. |
| Hardware Specification | Yes | All the experiments run on an instance configured with a 4-core Intel(R) Xeon(R) 2.50GHz CPU, 32GB memory and an NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' as an example of an ML library, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | No | Detailed information regarding models, datasets and hyperparameters can be found in Appendix A. |
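
For readers who want a concrete picture of the greedy ordering named in the Pseudocode row, below is a minimal NumPy sketch of the herding-style greedy idea: center the per-example gradients and repeatedly pick the example that keeps the running prefix sum smallest. This is an illustrative reconstruction under stated assumptions, not the paper's implementation; the function name `greedy_herding_order` and the toy data are assumptions, and GraB's actual memory-efficient ordering is specified in the paper and the linked repository.

```python
import numpy as np

def greedy_herding_order(grads):
    """Return a permutation of the rows of `grads` (n x d), chosen greedily.

    At each step, pick the example whose centered gradient keeps the norm of
    the running prefix sum smallest, which is the basic herding-style greedy
    ordering idea.
    """
    centered = grads - grads.mean(axis=0)   # center so the full sum is zero
    remaining = set(range(len(grads)))
    prefix = np.zeros(grads.shape[1])
    order = []
    while remaining:
        # greedily choose the index minimizing the norm of the new prefix sum
        best = min(remaining, key=lambda i: np.linalg.norm(prefix + centered[i]))
        prefix += centered[best]
        order.append(best)
        remaining.remove(best)
    return order

# toy usage with random "gradients"
rng = np.random.default_rng(0)
print(greedy_herding_order(rng.normal(size=(8, 4))))
```

Note that this naive greedy scan is quadratic in the number of examples and keeps every per-example gradient in memory, which helps explain why the paper reports large memory savings for GraB relative to greedy ordering.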