A Group-Theoretic Framework for Data Augmentation

Authors: Shuxiao Chen, Edgar Dobriban, Jane Lee

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We study finite-sample and asymptotic empirical risk minimization and work out as examples the variance reduction in certain two-layer neural networks. We further propose a strategy to exploit the benefits of data augmentation for general learning tasks. Figure 1: Benefits of data augmentation. Fig. (a) shows the test accuracy across training epochs of ResNet18 on CIFAR-10 (1) without data augmentation, (2) horizontally flipping the image with 0.5 probability, and (3) randomly cropping a 32×32 portion of the image + random horizontal flip (see Appendix D for details).
Researcher Affiliation | Academia | Shuxiao Chen, Department of Statistics, University of Pennsylvania (shuxiaoc@wharton.upenn.edu); Edgar Dobriban, Department of Statistics, University of Pennsylvania (dobriban@wharton.upenn.edu); Jane H. Lee, Department of Computer Science, University of Pennsylvania (janehlee@sas.upenn.edu)
Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that source code for the described methodology is publicly available.
Open Datasets | Yes | (a) Training ResNet18 on CIFAR-10
Dataset Splits | No | The paper mentions training on CIFAR-10 and showing test accuracy, but it does not specify the dataset splits (e.g., percentage for training, validation, or test sets) required for reproduction.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, or memory) used for running the experiments.
Software Dependencies | No | The paper mentions using ResNet18, but does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments.
Experiment Setup | Yes | Appendix D, Details of Figure 1: We train a ResNet18 [33] on CIFAR-10 [43] with three different setups: (1) no data augmentation, (2) random horizontal flip with probability 0.5, and (3) random crop of a 32×32 portion of the image followed by random horizontal flip with probability 0.5. We use SGD with momentum (0.9), learning rate 0.01 with a cosine annealing learning rate schedule, batch size 128, and train for 200 epochs.
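
The three augmentation setups quoted above map onto standard torchvision transform pipelines. The following is a minimal sketch, assuming PyTorch/torchvision; since the paper releases no code, the crop padding and the normalization constants are assumptions rather than values stated in the paper.

    import torchvision.transforms as T

    # CIFAR-10 channel statistics (assumed; not stated in the paper).
    normalize = T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))

    # (1) No data augmentation.
    no_aug = T.Compose([T.ToTensor(), normalize])

    # (2) Random horizontal flip with probability 0.5.
    flip_only = T.Compose([
        T.RandomHorizontalFlip(p=0.5),
        T.ToTensor(),
        normalize,
    ])

    # (3) Random 32x32 crop, then random horizontal flip with probability 0.5.
    # padding=4 is a common CIFAR-10 choice and is an assumption here.
    crop_and_flip = T.Compose([
        T.RandomCrop(32, padding=4),
        T.RandomHorizontalFlip(p=0.5),
        T.ToTensor(),
        normalize,
    ])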
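
The optimizer and schedule reported in Appendix D likewise correspond to standard PyTorch components. Below is a sketch under stated assumptions, not the authors' implementation: the model construction (torchvision's resnet18 with a 10-class head) and the training-loop structure are assumed, while the momentum, learning rate, cosine annealing schedule, batch size, and epoch count come from the quoted setup.

    import torch
    import torchvision

    # ResNet-18 with a 10-class head; how the architecture is adapted to
    # 32x32 inputs is not specified in the paper, so this is an assumption.
    model = torchvision.models.resnet18(num_classes=10)

    # Appendix D: SGD with momentum 0.9, learning rate 0.01,
    # cosine annealing schedule, batch size 128, 200 epochs.
    epochs = 200
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    criterion = torch.nn.CrossEntropyLoss()

    def train(train_loader):
        # train_loader: a DataLoader over CIFAR-10 with batch_size=128,
        # built with one of the transform pipelines sketched above.
        for epoch in range(epochs):
            for images, labels in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(images), labels)
                loss.backward()
                optimizer.step()
            scheduler.step()  # one cosine-annealing step per epoch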