Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Efficient Augmentation via Data Subsampling
Authors: Michael Kuchnik, Virginia Smith
ICLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments throughout on common benchmark datasets, such as MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky, 2009), and NORB (LeCun et al., 2004). |
| Researcher Affiliation | Academia | Michael Kuchnik & Virginia Smith Carnegie Mellon University EMAIL |
| Pseudocode | No | The paper describes its methods in prose and mathematical equations but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available online (footnote 2: https://github.com/mkuchnik/Efficient_Augmentation). |
| Open Datasets | Yes | We perform experiments throughout on common benchmark datasets, such as MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky, 2009), and NORB (LeCun et al., 2004). |
| Dataset Splits | No | The paper specifies training and test class splits for datasets (e.g., 'The MNIST train class split is 517/483, and its test class split is 1010/974.') but does not explicitly define a separate validation dataset split or a general cross-validation setup for the main experiment evaluation. |
| Hardware Specification | Yes | The system which was used for the test has an Intel i7-6700k and an Nvidia GTX 1080 using CUDA 9.2 and cuDNN 7.2.1. |
| Software Dependencies | Yes | Tensorflow (Abadi et al., 2015) version 1.10.1 with a variable number of training examples obtained from CIFAR10. The system which was used for the test has an Intel i7-6700k and an Nvidia GTX 1080 using CUDA 9.2 and cuDNN 7.2.1. |
| Experiment Setup | Yes | Both LeNet and the Keras neural network were fast to train, so we retrained the models for 40-50 epochs with Adam (Kingma & Ba, 2014) and a minibatch size of 512, which was enough to obtain convergence. |
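
To give a sense of scale for the reported setup, the sketch below works out how many optimizer steps the quoted epoch range implies at a minibatch size of 512. It is illustrative only and not taken from the authors' code: `ReportedSetup`, `steps_per_epoch`, and `total_steps_range` are hypothetical names, the training-set size of 1000 assumes the quoted MNIST train class split (517/483) makes up the whole training set, and the final partial minibatch is assumed to be kept rather than dropped.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Hyperparameters quoted in the Experiment Setup row (hypothetical container)."""
    optimizer: str = "Adam"
    batch_size: int = 512
    min_epochs: int = 40
    max_epochs: int = 50


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps per epoch, assuming the last partial minibatch is kept."""
    return math.ceil(num_examples / batch_size)


def total_steps_range(num_examples: int, setup: ReportedSetup) -> tuple[int, int]:
    """Lower and upper bound on total optimizer steps across the epoch range."""
    per_epoch = steps_per_epoch(num_examples, setup.batch_size)
    return per_epoch * setup.min_epochs, per_epoch * setup.max_epochs


setup = ReportedSetup()
# Assumption: the 517/483 train class split covers the full training set.
mnist_train_examples = 517 + 483

print(steps_per_epoch(mnist_train_examples, setup.batch_size))
print(total_steps_range(mnist_train_examples, setup))
```

Under these assumptions each epoch is only two optimizer steps, which is consistent with the quoted remark that the models were fast to retrain.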