A Kernel Theory of Modern Data Augmentation
Authors: Tri Dao, Albert Gu, Alexander Ratner, Virginia Smith, Chris De Sa, Christopher Ré
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we provide several proof-of-concept applications showing that our theory can be useful for accelerating machine learning workflows, such as reducing the amount of computation needed to train using augmented data, and predicting the utility of a transformation prior to training. and We empirically validate the first- and second-order approximations, ĝ(w) and g(w), on MNIST (LeCun et al., 1998) and CIFAR-10 (Krizhevsky & Hinton, 2009) datasets, performing rotation, crop, or blur as augmentations, and using either an RBF kernel with random Fourier features (Rahimi & Recht, 2007) or LeNet (details in Appendix E.1) as a base model. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Stanford University, California, USA 2Department of Electrical and Computer Engineering, Carnegie Mellon University, Pennsylvania, USA 3Department of Computer Science, Cornell University, New York, USA. |
| Pseudocode | No | The paper does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code to reproduce experiments and plots: https://github.com/HazyResearch/augmentation_code |
| Open Datasets | Yes | on MNIST (LeCun et al., 1998) and CIFAR-10 (Krizhevsky & Hinton, 2009) datasets and real-world mammography tumor-classification dataset, DDSM (Heath et al., 2000; Clark et al., 2013; Lee et al., 2016). |
| Dataset Splits | No | The paper mentions using MNIST and CIFAR-10 datasets for empirical validation, but it does not provide specific details on training, validation, or test splits (e.g., percentages, sample counts, or explicit splitting methodology) within the provided text. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | The paper mentions models like RBF kernel and LeNet, but it does not specify software dependencies (e.g., libraries, frameworks, or solvers) with version numbers that would be needed to replicate the experiments. |
| Experiment Setup | Yes | In particular, in Figure 1a, we plot the difference after 10 epochs of SGD training... and All models are RBF kernel classifiers with 10,000 random Fourier features... and We augment via rotation between −15 and 15 degrees. |
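The setup quoted above (an RBF kernel classifier built from random Fourier features, trained with SGD on rotation-augmented images in [−15°, 15°]) can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' released code: the synthetic 8×8 images, the 256-feature width, the nearest-neighbour rotation, and all hyperparameters here are placeholder assumptions standing in for MNIST/CIFAR-10 and the paper's 10,000 features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the paper's data: fake 8x8 "images" with binary labels.
N, H, D_FEAT, SIGMA = 200, 8, 256, 5.0
X = rng.normal(size=(N, H, H))
y = (X.mean(axis=(1, 2)) > 0).astype(int)

def rotate_nn(im, deg):
    """Nearest-neighbour rotation about the image centre (zero padding)."""
    h, w = im.shape
    t = np.deg2rad(deg)
    c, s = np.cos(t), np.sin(t)
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Inverse-map each output pixel back into the source image.
    sy = np.rint(c * (yy - cy) + s * (xx - cx) + cy).astype(int)
    sx = np.rint(-s * (yy - cy) + c * (xx - cx) + cx).astype(int)
    ok = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out = np.zeros_like(im)
    out[ok] = im[sy[ok], sx[ok]]
    return out

def augment_rotation(imgs, rng, max_deg=15.0):
    """Rotate each image by a uniform random angle in [-max_deg, max_deg]."""
    angles = rng.uniform(-max_deg, max_deg, size=len(imgs))
    return np.stack([rotate_nn(im, a) for im, a in zip(imgs, angles)])

# Random Fourier features approximating an RBF kernel (Rahimi & Recht, 2007):
# z(x) = sqrt(2/D) * cos(W x + b), W ~ N(0, I / sigma^2), b ~ U[0, 2*pi].
d = H * H
W = rng.normal(scale=1.0 / SIGMA, size=(d, D_FEAT))
b = rng.uniform(0.0, 2 * np.pi, size=D_FEAT)

def rff(imgs):
    flat = imgs.reshape(len(imgs), -1)
    return np.sqrt(2.0 / D_FEAT) * np.cos(flat @ W + b)

# Linear (logistic) classifier on the features, trained for 10 epochs on
# freshly augmented copies of the data -- the "augmented" objective.
w_lin = np.zeros(D_FEAT)
for epoch in range(10):
    Z = rff(augment_rotation(X, rng))
    p = 1.0 / (1.0 + np.exp(-(Z @ w_lin)))
    w_lin -= 0.5 * Z.T @ (p - y) / N  # full-batch gradient step on logistic loss

acc = ((rff(X) @ w_lin > 0).astype(int) == y).mean()
```

The key design point mirrored from the paper is that augmentation is resampled every epoch, so the classifier effectively trains on the augmentation-averaged objective that the kernel theory analyzes.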