Provable Data Subset Selection For Efficient Neural Networks Training

Authors: Murad Tukan, Samson Zhou, Alaa Maalouf, Daniela Rus, Vladimir Braverman, Dan Feldman

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we practically demonstrate the efficiency and stability of our RBFNN coreset approach for training deep neural networks via data subset selection. We mainly study the trade-off between accuracy and efficiency. (...) Tables 1–4 report the results for CIFAR10 and CIFAR100. It is clear from Tables 1 and 2 that our method achieves the best accuracy, with and without warm start, for 5%, 20%, and 30% subset selection on CIFAR10.
Researcher Affiliation | Collaboration | (1) DataHeroes, Israel; (2) Department of Computer Science, Rice University; (3) CSAIL, MIT, Cambridge, USA; (4) Department of Computer Science, University of Haifa, Israel.
Pseudocode | Yes | Algorithm 1 CORESET(P, w, R, m). Input: a set P ⊆ ℝ^d of n points, a weight function w : P → [0, ∞), a bound R on the radius of the query space X, and a sample size m ≥ 1. Output: a pair (S, v) that satisfies Theorem 3.2. (A hedged sketch of this sampling pattern is given after the table.)
Open Source Code | Yes | Finally, we provide an open-source code implementation of our algorithm for reproducing our results and future research (ope, 2023). The cited entry is the paper's open-source code link: Open source code for all the algorithms presented in this paper, 2023.
Open Datasets | Yes | We performed our experiments for training CIFAR10 and CIFAR100 (Krizhevsky et al., 2009) on ResNet18 (He et al., 2016), MNIST (LeCun et al., 1998) on LeNet, and ImageNet-2012 (Deng et al., 2009) on ResNet18 (He et al., 2016).
Dataset Splits | No | The paper mentions training on subsets and reporting test accuracy, but it does not explicitly specify a validation split or the use of a validation set.
Hardware Specification | Yes | All experiments were executed on V100 GPUs.
Software Dependencies | No | The paper mentions optimizers and learning-rate schedulers (SGD, cosine annealing) but does not provide software versions for libraries such as PyTorch, TensorFlow, or scikit-learn.
Experiment Setup | Yes | We adapted the same setting as (Killamsetty et al., 2021a): we used the SGD optimizer with an initial learning rate of 0.01, a momentum of 0.9, and a weight decay of 5e-4. We decay the learning rate using cosine annealing (Loshchilov & Hutter, 2016) at each epoch. For MNIST, we trained the LeNet model for 200 epochs. For CIFAR10 and CIFAR100, we trained ResNet18 for 300 epochs, all on batches of size 20 for the subset-selection training versions. (A hedged PyTorch sketch of this setup follows the table.)
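
The CORESET routine quoted in the Pseudocode row is an importance-sampling procedure. The sketch below illustrates only the generic sensitivity-sampling pattern such a routine follows: sample m points with probability proportional to weight times sensitivity, then reweight each sample by its inverse sampling probability. The helper name coreset, the placeholder sensitivities argument, and the uniform example values are assumptions for illustration; the paper's actual contribution is the RBFNN-specific sensitivity bound (which depends on the query-space radius R), and that bound is not reproduced here.

    import numpy as np

    def coreset(P, w, sensitivities, m, rng=None):
        # P: (n, d) array of points; w: (n,) non-negative input weights.
        # sensitivities: (n,) per-point importance upper bounds, supplied by
        # the caller (the paper derives RBFNN-specific bounds for these).
        rng = np.random.default_rng() if rng is None else rng
        s = w * sensitivities
        prob = s / s.sum()                        # sampling distribution
        idx = rng.choice(len(P), size=m, replace=True, p=prob)
        S = P[idx]
        v = w[idx] / (m * prob[idx])              # inverse-probability reweighting
        return S, v

    # Example: uniform sensitivities reduce to uniform sampling, with each
    # sampled point reweighted to (n / m) times its input weight.
    P = np.random.randn(1000, 16)
    S, v = coreset(P, np.ones(1000), np.ones(1000), m=100)

With this reweighting, the weighted sum of any per-point loss over (S, v) is an unbiased estimator of its weighted sum over (P, w); a worst-case approximation guarantee of the kind stated in Theorem 3.2 then depends on how tight the sensitivity bounds are.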
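
The Experiment Setup row quotes hyperparameters but not the surrounding training code. Below is a minimal PyTorch sketch of the reported CIFAR10/ResNet18 configuration (SGD with learning rate 0.01, momentum 0.9, weight decay 5e-4, cosine annealing stepped once per epoch, batch size 20, 300 epochs), assuming torchvision's CIFAR10 and resnet18. The subset_idx placeholder stands in for the indices produced by the coreset selection, and the normalization constants and the unmodified ResNet18 stem are assumptions rather than details confirmed by the paper.

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, Subset
    import torchvision
    import torchvision.transforms as T

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Standard CIFAR10 normalization constants (assumed, not taken from the paper).
    transform = T.Compose([T.ToTensor(),
                           T.Normalize((0.4914, 0.4822, 0.4465),
                                       (0.2470, 0.2435, 0.2616))])
    train_set = torchvision.datasets.CIFAR10("data", train=True, download=True,
                                             transform=transform)

    subset_idx = list(range(0, len(train_set), 10))   # placeholder: every 10th point
    loader = DataLoader(Subset(train_set, subset_idx), batch_size=20, shuffle=True)

    model = torchvision.models.resnet18(num_classes=10).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                                momentum=0.9, weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300)

    for epoch in range(300):
        model.train()
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        scheduler.step()   # cosine annealing stepped once per epoch, as reported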