A Kernel Theory of Modern Data Augmentation

Authors: Tri Dao, Albert Gu, Alexander Ratner, Virginia Smith, Chris De Sa, Christopher Ré

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we provide several proof-of-concept applications showing that our theory can be useful for accelerating machine learning workflows, such as reducing the amount of computation needed to train using augmented data, and predicting the utility of a transformation prior to training." and "We empirically validate the first- and second-order approximations, ĝ(w) and g(w), on MNIST (LeCun et al., 1998) and CIFAR-10 (Krizhevsky & Hinton, 2009) datasets, performing rotation, crop, or blur as augmentations, and using either an RBF kernel with random Fourier features (Rahimi & Recht, 2007) or LeNet (details in Appendix E.1) as a base model." (see the feature-map sketch after this table)
Researcher Affiliation | Academia | "1 Department of Computer Science, Stanford University, California, USA; 2 Department of Electrical and Computer Engineering, Carnegie Mellon University, Pennsylvania, USA; 3 Department of Computer Science, Cornell University, New York, USA."
Pseudocode | No | The paper does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "Code to reproduce experiments and plots: https://github.com/HazyResearch/augmentation_code"
Open Datasets | Yes | "on MNIST (LeCun et al., 1998) and CIFAR-10 (Krizhevsky & Hinton, 2009) datasets" and "real-world mammography tumor-classification dataset, DDSM (Heath et al., 2000; Clark et al., 2013; Lee et al., 2016)."
Dataset Splits | No | The paper uses MNIST and CIFAR-10 for empirical validation but does not give specific training/validation/test splits (e.g., percentages, sample counts, or an explicit splitting methodology).
Hardware Specification | No | The paper does not report the hardware (e.g., GPU/CPU models, memory, or cloud instance types) used to run its experiments.
Software Dependencies | No | The paper mentions models such as an RBF kernel and LeNet, but it does not list software dependencies (libraries, frameworks, or solvers) with version numbers needed to replicate the experiments.
Experiment Setup | Yes | "In particular, in Figure 1a, we plot the difference after 10 epochs of SGD training..." and "All models are RBF kernel classifiers with 10,000 random Fourier features..." and "We augment via rotation between −15 and 15 degrees." (a rotation-averaging sketch follows the table)
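
For context on the base model quoted in the Research Type row, below is a minimal sketch of an RBF-kernel classifier approximated with random Fourier features (Rahimi & Recht, 2007). This is not the authors' code; the bandwidth sigma and the stand-in batch are illustrative assumptions, while the 10,000-feature count matches the quote.

```python
import numpy as np

def random_fourier_features(X, W, b):
    """Map X so that phi(x) . phi(y) approximates exp(-||x - y||^2 / (2 sigma^2))."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
d, D, sigma = 784, 10_000, 5.0              # flattened 28x28 input; sigma is an assumption
W = rng.normal(scale=1.0 / sigma, size=(d, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

X = rng.normal(size=(32, d))                # stand-in batch; the paper uses MNIST / CIFAR-10
Phi = random_fourier_features(X, W, b)      # (32, 10000); feed to any linear classifier
```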
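And a hedged sketch of the Experiment Setup row's rotation augmentation combined with the paper's first-order idea of averaging the feature map over augmented copies. The −15 to 15 degree range comes from the quote; the number of copies, the reduced feature dimension, and the toy image are assumptions.

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)
d, D, sigma = 784, 512, 5.0                       # feature map sized down from 10,000 for brevity
W = rng.normal(scale=1.0 / sigma, size=(d, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)
phi = lambda x: np.sqrt(2.0 / D) * np.cos(x @ W + b)

def augmented_mean_features(img, n_aug=8):
    """Average phi over rotated copies; angles follow the quoted -15..15 degree range."""
    angles = rng.uniform(-15.0, 15.0, size=n_aug)
    return np.mean([phi(rotate(img, a, reshape=False).ravel()) for a in angles], axis=0)

img = np.zeros((28, 28)); img[8:20, 12:16] = 1.0  # toy MNIST-like digit (assumption)
avg_feat = augmented_mean_features(img)           # train a linear model on these averaged features
```

Training a linear classifier on such averaged features is roughly what the quoted first-order approximation ĝ(w) amounts to for a kernel model, under the assumptions stated above.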