Efficient Augmentation via Data Subsampling

Authors: Michael Kuchnik, Virginia Smith

ICLR 2019

Each entry below gives the reproducibility variable, the assessed result, and the supporting LLM response.
Research Type: Experimental
  LLM Response: We perform experiments throughout on common benchmark datasets, such as MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky, 2009), and NORB (LeCun et al., 2004).
Researcher Affiliation: Academia
  LLM Response: Michael Kuchnik & Virginia Smith, Carnegie Mellon University, {mkuchnik,smithv}@cmu.edu
Pseudocode: No
  LLM Response: The paper describes its methods in prose and mathematical equations but does not include explicit pseudocode or algorithm blocks.
Open Source Code: Yes
  LLM Response: Our code is publicly available online (https://github.com/mkuchnik/Efficient_Augmentation).
Open Datasets: Yes
  LLM Response: We perform experiments throughout on common benchmark datasets, such as MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky, 2009), and NORB (LeCun et al., 2004). (A minimal dataset-loading sketch appears after the table.)
Dataset Splits: No
  LLM Response: The paper reports training and test class splits for its datasets (e.g., 'The MNIST train class split is 517/483, and its test class split is 1010/974.') but does not define a separate validation split or a cross-validation setup for the main experiments.
Hardware Specification: Yes
  LLM Response: The system which was used for the test has an Intel i7-6700k and an Nvidia GTX 1080 using CUDA 9.2 and cuDNN 7.2.1.
Software Dependencies: Yes
  LLM Response: Tensorflow (Abadi et al., 2015) version 1.10.1 with a variable number of training examples obtained from CIFAR10. The system which was used for the test has an Intel i7-6700k and an Nvidia GTX 1080 using CUDA 9.2 and cuDNN 7.2.1.
Experiment Setup: Yes
  LLM Response: Both LeNet and the Keras neural network were fast to train, so we retrained the models for 40-50 epochs with Adam (Kingma & Ba, 2014) and a minibatch size of 512, which was enough to obtain convergence. (A minimal sketch of this training configuration appears below.)
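
The benchmark datasets cited in the Open Datasets row are available through standard loaders. The snippet below is a minimal sketch, not the authors' code, using the tf.keras.datasets loaders for MNIST and CIFAR10; NORB has no bundled Keras loader and would need to be obtained separately.

    # Minimal sketch: loading two of the benchmark datasets named above.
    # MNIST and CIFAR10 ship with tf.keras; NORB must be downloaded separately.
    import tensorflow as tf

    (mnist_x_train, mnist_y_train), (mnist_x_test, mnist_y_test) = \
        tf.keras.datasets.mnist.load_data()
    (cifar_x_train, cifar_y_train), (cifar_x_test, cifar_y_test) = \
        tf.keras.datasets.cifar10.load_data()

    # Scale pixel values to [0, 1]; MNIST additionally needs a channel axis.
    mnist_x_train = mnist_x_train[..., None].astype("float32") / 255.0
    cifar_x_train = cifar_x_train.astype("float32") / 255.0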
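
The Experiment Setup row can likewise be summarized as a short training sketch. This is an illustration under assumptions, not the authors' implementation: it assumes a LeNet-style tf.keras model on MNIST (the layer sizes are illustrative) and the TensorFlow/Keras API noted in the Software Dependencies row; the actual architectures, preprocessing, and augmentation policies are in the repository linked above.

    # Minimal sketch of the reported training configuration (not the authors' code):
    # Adam optimizer, minibatch size 512, roughly 40-50 epochs (50 used here).
    import tensorflow as tf

    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train[..., None].astype("float32") / 255.0

    # A LeNet-style convolutional network; layer sizes are illustrative only.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(6, 5, activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(16, 5, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(120, activation="relu"),
        tf.keras.layers.Dense(84, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    # Adam with default settings and a minibatch size of 512, as quoted in the table.
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=512, epochs=50)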