GraB: Finding Provably Better Data Permutations than Random Reshuffling

Authors: Yucheng Lu, Wentao Guo, Christopher M. De Sa

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We show empirically on applications including MNIST, CIFAR10, WikiText and GLUE that GraB can outperform random reshuffling in terms of both training and validation performance, and even outperform state-of-the-art greedy ordering while reducing memory usage over 100×." |
| Researcher Affiliation | Academia | "Yucheng Lu, Wentao Guo, Christopher De Sa. Department of Computer Science, Cornell University. {yl2967, wg247, cmd353}@cornell.edu" |
| Pseudocode | Yes | "Algorithm 1: Herding with Greedy Ordering" (sketched, along with GraB's balancing step, after this table) |
| Open Source Code | Yes | "The experimental code is available at https://github.com/EugeneLYC/GraB." |
| Open Datasets | Yes | "We show empirically on applications including MNIST, CIFAR10, WikiText and GLUE that GraB can outperform random reshuffling in terms of both training and validation performance, and even outperform state-of-the-art greedy ordering while reducing memory usage over 100×." |
| Dataset Splits | No | The paper mentions using training and validation data, but does not specify explicit dataset split percentages or sample counts, nor refer to predefined splits in the main text. |
| Hardware Specification | Yes | "All the experiments run on an instance configured with a 4-core Intel(R) Xeon(R) 2.50GHz CPU, 32GB memory and an NVIDIA GeForce RTX 2080 Ti GPU." |
| Software Dependencies | No | The paper mentions 'PyTorch' as an example of an ML library, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | No | "Detailed information regarding models, datasets and hyperparameters can be found in Appendix A." |
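On the "Pseudocode" row: Algorithm 1 in the paper is the herding-with-greedy-ordering baseline that GraB is measured against. As a rough illustration only (not the authors' implementation, which lives in the linked repository), the NumPy sketch below orders examples by repeatedly picking whichever centered per-example gradient keeps the prefix sum smallest. The function name, the offline setting, and the use of exact per-example gradients are all our assumptions for clarity.

```python
import numpy as np

def greedy_herding_order(grads):
    """Sketch of the herding-with-greedy-ordering baseline (Algorithm 1).

    grads: (n, d) array of per-example gradient vectors (assumed given).
    Returns a permutation of range(n). Each step scans every unused
    example, so the search alone costs O(n^2) vector-norm evaluations.
    """
    z = grads - grads.mean(axis=0)  # center so the vectors sum to ~0
    remaining = set(range(len(z)))
    prefix = np.zeros(z.shape[1])
    order = []
    while remaining:
        # greedily pick the example that keeps the prefix sum smallest
        i = min(remaining, key=lambda j: np.linalg.norm(prefix + z[j]))
        order.append(i)
        prefix += z[i]
        remaining.remove(i)
    return order
```

GraB replaces that per-step scan with a single balancing pass: each centered gradient greedily receives a +1 or -1 sign so that one running sum stays small, and +1 examples fill the next epoch's order from the front while -1 examples fill it from the back. The sketch below is again a simplification under the same assumptions; in the paper the mean is stale (carried over from the previous epoch) and the reordering happens online during training, which is where the quoted memory savings over greedy ordering come from.

```python
def grab_style_reorder(grads):
    """Simplified GraB-style reordering pass (offline, exact mean).

    Keeps a single running-sum vector instead of scanning all unused
    examples, then writes +1-signed examples from the front and
    -1-signed examples from the back of the new order.
    """
    n = len(grads)
    z = grads - grads.mean(axis=0)
    s = np.zeros(z.shape[1])
    order = [0] * n
    lo, hi = 0, n - 1
    for i in range(n):
        # choose the sign that keeps the running sum small
        if np.linalg.norm(s + z[i]) <= np.linalg.norm(s - z[i]):
            s += z[i]
            order[lo] = i
            lo += 1
        else:
            s -= z[i]
            order[hi] = i
            hi -= 1
    return order
```

Either routine returns a permutation of range(n); feeding that permutation to the next epoch's data loader, in place of a fresh random shuffle, is the intended use.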