Efficient Augmentation via Data Subsampling

Authors: Michael Kuchnik, Virginia Smith

ICLR 2019

Each entry below gives the reproducibility variable, the assessed result, and the supporting LLM response.
Research Type: Experimental
  LLM Response: We perform experiments throughout on common benchmark datasets, such as MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky, 2009), and NORB (LeCun et al., 2004).
Researcher Affiliation: Academia
  LLM Response: Michael Kuchnik & Virginia Smith, Carnegie Mellon University, {mkuchnik,smithv}@cmu.edu
Pseudocode: No
  LLM Response: The paper describes its methods in prose and mathematical equations but does not include explicit pseudocode or algorithm blocks.
Open Source Code: Yes
  LLM Response: Our code is publicly available online (https://github.com/mkuchnik/Efficient_Augmentation).
Open Datasets: Yes
  LLM Response: We perform experiments throughout on common benchmark datasets, such as MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky, 2009), and NORB (LeCun et al., 2004). (A minimal dataset-loading sketch appears after the table.)
Dataset Splits: No
  LLM Response: The paper reports training and test class splits for its datasets (e.g., 'The MNIST train class split is 517/483, and its test class split is 1010/974.') but does not define a separate validation split or a cross-validation setup for the main experiments.
Hardware Specification: Yes
  LLM Response: The system which was used for the test has an Intel i7-6700k and an Nvidia GTX 1080 using CUDA 9.2 and cuDNN 7.2.1.
Software Dependencies: Yes
  LLM Response: Tensorflow (Abadi et al., 2015) version 1.10.1 with a variable number of training examples obtained from CIFAR10. The system which was used for the test has an Intel i7-6700k and an Nvidia GTX 1080 using CUDA 9.2 and cuDNN 7.2.1.
Experiment Setup: Yes
  LLM Response: Both LeNet and the Keras neural network were fast to train, so we retrained the models for 40-50 epochs with Adam (Kingma & Ba, 2014) and a minibatch size of 512, which was enough to obtain convergence. (A minimal sketch of this training configuration appears below.)
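
The benchmark datasets cited in the Open Datasets row are available through standard loaders. The snippet below is a minimal sketch, not the authors' code, using the tf.keras.datasets loaders for MNIST and CIFAR10; NORB has no bundled Keras loader and would need to be obtained separately.

    # Minimal sketch: loading two of the benchmark datasets named above.
    # MNIST and CIFAR10 ship with tf.keras; NORB must be downloaded separately.
    import tensorflow as tf

    (mnist_x_train, mnist_y_train), (mnist_x_test, mnist_y_test) = \
        tf.keras.datasets.mnist.load_data()
    (cifar_x_train, cifar_y_train), (cifar_x_test, cifar_y_test) = \
        tf.keras.datasets.cifar10.load_data()

    # Scale pixel values to [0, 1]; MNIST additionally needs a channel axis.
    mnist_x_train = mnist_x_train[..., None].astype("float32") / 255.0
    cifar_x_train = cifar_x_train.astype("float32") / 255.0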
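
The Experiment Setup row can likewise be summarized as a short training sketch. This is an illustration under assumptions, not the authors' implementation: it assumes a LeNet-style tf.keras model on MNIST (the layer sizes are illustrative) and the TensorFlow/Keras API noted in the Software Dependencies row; the actual architectures, preprocessing, and augmentation policies are in the repository linked above.

    # Minimal sketch of the reported training configuration (not the authors' code):
    # Adam optimizer, minibatch size 512, roughly 40-50 epochs (50 used here).
    import tensorflow as tf

    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train[..., None].astype("float32") / 255.0

    # A LeNet-style convolutional network; layer sizes are illustrative only.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(6, 5, activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(16, 5, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(120, activation="relu"),
        tf.keras.layers.Dense(84, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    # Adam with default settings and a minibatch size of 512, as quoted in the table.
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=512, epochs=50)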