A General Analysis of Example-Selection for Stochastic Gradient Descent
Authors: Yucheng Lu, Si Yi Meng, Christopher De Sa
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we evaluate our two algorithms on several image classification benchmarks including MNIST, CIFAR10/100 and ImageNet. We show with QMC-based data augmentation, a higher validation accuracy can be achieved without hyperparameter tuning; this suggests that QMC may be a good default driver to use with data augmentation for deep learning in general. Meanwhile, the greedy algorithm converges faster both in terms of iteration and wall-clock time (Section 6). |
| Researcher Affiliation | Academia | Yucheng Lu, Si Yi Meng, Christopher De Sa, Department of Computer Science, Cornell University, Ithaca, NY 14853, USA, {yl2967,sm2833,cmd353}@cornell.edu |
| Pseudocode | Yes | Algorithm 1 Example-Ordered SGD via Greedily Minimizing Average Gradient Error |
| Open Source Code | Yes | Our code is available at: https://github.com/EugeneLYC/qmc-ordering. |
| Open Datasets | Yes | Empirically, we evaluate our two algorithms on several image classification benchmarks including MNIST, CIFAR10/100 and ImageNet. ... In addition to using synthetic data, we also performed an offline version of the experiment Figure 1(b) on a real dataset, a6a from the LIBSVM repository (Chang & Lin, 2011). |
| Dataset Splits | Yes | Empirically, we evaluate our two algorithms on several image classification benchmarks including MNIST, CIFAR10/100 and ImageNet. ... We start by training ResNet-20 on CIFAR10 and CIFAR100, where discrete and continuous data augmentations are applied, respectively. |
| Hardware Specification | Yes | In Section 6, all the training scripts are implemented via PyTorch 1.6 and run on a single machine configured with a 2.6GHz 4-core Intel(R) Xeon(R) CPU, 16GB memory and NVIDIA GeForce GTX 1080Ti with CUDA 10.1. |
| Software Dependencies | Yes | In Section 6, all the training scripts are implemented via PyTorch 1.6 and run on a single machine configured with a 2.6GHz 4-core Intel(R) Xeon(R) CPU, 16GB memory and NVIDIA GeForce GTX 1080Ti with CUDA 10.1. |
| Experiment Setup | Yes | We run the baseline IID-uniform method with finetuned hyperparameters (weight decay 10^-4), which reproduces the result from He et al. (2016) with an error rate of 8.4%. Then we run QMC-based augmentation with the same hyperparameter (untuned) and finetuned counterparts, with a grid search over weight decay values in {r · 10^-4 : r = 1, ..., 4}. |
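
The Experiment Setup row describes a grid search over the weight-decay values {r · 10^-4 : r = 1, ..., 4}. Below is a minimal sketch of that search pattern, assuming a toy model and random placeholder tensors in place of the paper's ResNet-20 / CIFAR-10 pipeline; the learning rate, momentum, and step count are illustrative, not the paper's settings.

```python
# Minimal sketch of the weight-decay grid search described in the Experiment Setup
# row. The model, data, and training budget below are placeholders, not the
# paper's ResNet-20 / CIFAR-10 setup.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder tensors standing in for CIFAR-10 train/validation batches.
x_train, y_train = torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,))
x_val, y_val = torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,))

def make_model():
    # Tiny stand-in for ResNet-20.
    return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))

results = {}
for r in range(1, 5):                       # weight decay in {1e-4, 2e-4, 3e-4, 4e-4}
    wd = r * 1e-4
    model, loss_fn = make_model(), nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=wd)
    for _ in range(20):                     # a few SGD steps; the paper trains far longer
        opt.zero_grad()
        loss_fn(model(x_train), y_train).backward()
        opt.step()
    with torch.no_grad():
        results[wd] = (model(x_val).argmax(1) == y_val).float().mean().item()

best_wd = max(results, key=results.get)
print(f"best weight decay by validation accuracy: {best_wd:.0e}")
```

In the paper's setting, each candidate weight decay would be trained to convergence and compared by final validation accuracy rather than by the handful of steps used here.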
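
The Research Type row quotes the paper's claim that QMC-based data augmentation improves validation accuracy without hyperparameter tuning. The sketch below shows one way a low-discrepancy (Sobol) sequence could drive continuous augmentation parameters in place of i.i.d. uniform draws; the crop/flip mapping and the use of `torch.quasirandom.SobolEngine` are illustrative assumptions, not the paper's exact construction.

```python
# Illustrative sketch: drive data-augmentation parameters with a low-discrepancy
# (Sobol) sequence instead of i.i.d. uniform draws, in the spirit of the paper's
# QMC-based data augmentation. The specific crop/flip mapping is an assumption.
import torch
import torchvision.transforms.functional as TF
from torch.quasirandom import SobolEngine

sobol = SobolEngine(dimension=3, scramble=True, seed=0)

def qmc_augment(img, u):
    """Apply a padded random crop and horizontal flip whose parameters come from
    one QMC point u in [0, 1)^3 rather than independent uniform samples."""
    _, h, w = img.shape
    padded = TF.pad(img, [4, 4, 4, 4])     # standard CIFAR-style 4-pixel padding
    top = int(u[0] * 8)                    # crop offsets driven by u[0], u[1]
    left = int(u[1] * 8)
    img = TF.crop(padded, top, left, h, w)
    if u[2] < 0.5:                         # flip decision driven by u[2]
        img = TF.hflip(img)
    return img

# Placeholder batch standing in for CIFAR-10 images.
batch = torch.rand(8, 3, 32, 32)
points = sobol.draw(len(batch))            # one low-discrepancy point per example
augmented = torch.stack([qmc_augment(img, u) for img, u in zip(batch, points)])
print(augmented.shape)                     # torch.Size([8, 3, 32, 32])
```

The intended benefit, per the paper's QMC framing, is that points covering the augmentation-parameter space more evenly than i.i.d. samples reduce the gradient error introduced by augmentation randomness.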