Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning
Authors: Patrik Okanovic, Roger Waleffe, Vasilis Mageirakos, Konstantinos Nikolakakis, Amin Karbasi, Dionysios Kalogerias, Nezihe Merve Gürel, Theodoros Rekatsinas
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test RS2 against thirty-two state-of-the-art data pruning and distillation methods across four datasets including ImageNet. Our results demonstrate that RS2 significantly reduces time-to-accuracy, particularly in practical regimes where accuracy, but not runtime, is similar to that of training on the full dataset. |
| Researcher Affiliation | Collaboration | ETH Zürich, University of Wisconsin-Madison, Yale, Google Research, TU Delft |
| Pseudocode | Yes | Algorithm 1: RS2 General Algorithm (an illustrative sketch of this loop appears below the table). |
| Open Source Code | Yes | Source code: https://github.com/PatrikOkanovic/RS2 |
| Open Datasets | Yes | We benchmark RS2 against baseline methods using CIFAR10 (Krizhevsky et al., 2009), CIFAR100 (Krizhevsky et al., 2009), ImageNet30 (a subset of ImageNet) (Hendrycks et al., 2019), and ImageNet (Russakovsky et al., 2015) itself. |
| Dataset Splits | No | The paper uses standard public datasets like CIFAR10 and ImageNet, which have predefined splits, but it does not explicitly state the training, validation, or test split percentages or sample counts within the paper. It refers to a 'normal test set' but gives no full split breakdown. |
| Hardware Specification | Yes | We train all methods from scratch on NVIDIA 3090 GPUs and use all baselines which do not give GPU out-of-memory. For the experiments reported here, we run baselines as they were originally proposed (i.e., with static subset selection). This allows us to quantify the overhead of selecting a single subset with existing methods compared to repeatedly selecting many random subsets with RS2. We show the time-to-accuracy on CIFAR10 in Figure 3a and on ImageNet in Figure 3b using r = 10% for both datasets. We run GPT2 experiments using AWS P3 GPU instances with eight NVIDIA V100 GPUs (as GPT2 experiments require more compute power). |
| Software Dependencies | No | The paper mentions using SGD as an optimizer and implies the use of common deep learning frameworks, but it does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | For CIFAR10 and CIFAR100 experiments, we use SGD as the optimizer with batch size 128, initial learning rate 0.1, a cosine decay learning rate schedule (Loshchilov & Hutter, 2016), momentum 0.9, weight decay 0.0005, and 200 training epochs. (An illustrative sketch of this configuration appears below the table.) |
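The Algorithm 1 referenced in the Pseudocode row is RS2's core loop: in every round, a fresh uniformly random subset of size r·N is drawn and the model is trained on it for one pass, rather than selecting a single static subset up front. Below is a minimal PyTorch-style sketch of that idea; the function name `rs2_train` and its arguments are illustrative placeholders, not the authors' released implementation (see the linked repository for that).

```python
import torch
from torch.utils.data import DataLoader, Subset

def rs2_train(model, dataset, optimizer, loss_fn, rounds, selection_ratio,
              batch_size=128, device="cuda"):
    """Illustrative sketch of repeated random sampling (RS2): each round
    draws a new uniform random subset of size r * N and trains on it
    for a single pass."""
    n = len(dataset)
    subset_size = max(1, int(selection_ratio * n))
    model.to(device)
    for _ in range(rounds):
        # Re-sample a fresh random subset every round instead of reusing
        # one statically pruned subset for the whole training run.
        indices = torch.randperm(n)[:subset_size].tolist()
        loader = DataLoader(Subset(dataset, indices),
                            batch_size=batch_size, shuffle=True)
        model.train()
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()
    return model
```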
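The Experiment Setup row lists concrete CIFAR hyperparameters. A minimal sketch of how that configuration maps onto standard PyTorch components is shown below; the choice of model and the `train_one_epoch` helper are assumptions for illustration and are not specified in the table.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

EPOCHS = 200  # training epochs reported for the CIFAR10/CIFAR100 setup

def build_optimizer_and_scheduler(model):
    """Reported CIFAR configuration: SGD with initial learning rate 0.1,
    momentum 0.9, weight decay 0.0005, and a cosine decay schedule."""
    optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
    scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)
    return optimizer, scheduler

# Usage (model is any torch.nn.Module; the DataLoader uses batch_size=128;
# train_one_epoch is a hypothetical helper, not part of the paper's code):
# optimizer, scheduler = build_optimizer_and_scheduler(model)
# for epoch in range(EPOCHS):
#     train_one_epoch(model, loader, optimizer)
#     scheduler.step()
```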