Understanding the detrimental class-level effects of data augmentation

Authors: Polina Kirichenko, Mark Ibrahim, Randall Balestriero, Diane Bouchacourt, Shanmukha Ramakrishna Vedantam, Hamed Firooz, Andrew G. Wilson

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we present a framework for understanding how DA interacts with class-level learning dynamics. Using higher-quality multi-label annotations on ImageNet, we systematically categorize the affected classes and find that the majority are inherently ambiguous, co-occur, or involve fine-grained distinctions, while DA controls the model's bias towards one of the closely related classes. While many of the previously reported performance drops are explained by multi-label annotations, our analysis of class confusions reveals other sources of accuracy degradation. We show that simple class-conditional augmentation strategies informed by our framework improve performance on the negatively affected classes. (See the class-conditional augmentation sketch after the table.)
Researcher Affiliation | Collaboration | Polina Kirichenko (1,2), Mark Ibrahim (2), Randall Balestriero (2), Diane Bouchacourt (2), Ramakrishna Vedantam (2), Hamed Firooz (2), Andrew Gordon Wilson (1); (1) New York University, (2) Meta AI
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper.
Open Datasets | Yes | We train ResNet-50 models [22] on ImageNet [53]
Dataset Splits | Yes | ReaL labels. Beyer et al. [4] used large-scale vision models to generate new label proposals for the ImageNet validation set, which were then evaluated by human annotators. We use f_s(·) to denote a neural network trained with augmentation parameter s, l_ReaL(x) the set of ReaL labels for a validation example x, X the set of all validation images, and X_k the validation examples with the original label k. (See the per-class ReaL accuracy sketch after the table.)
Hardware Specification | Yes | The experiments were run on GPU clusters on Nvidia Tesla V100, Titan RTX, RTX 8000, 3080, and 1080Ti GPUs.
Software Dependencies | Yes | We use PyTorch [47], automatic mixed precision training with the torch.amp package, and the ffcv package [34] for fast data loading. (FFCV [34] refers to a GitHub link with a 'commit xxxxxxx' hash, which serves as a specific version identifier.)
Experiment Setup | Yes | We train ResNet-50 for 88 epochs using label smoothing with α = 0.1 [60]. We use image resolution R_train = 176 during training and evaluate on images with resolution R_test = 224... We apply random horizontal flips and Random Resized Crop (RRC) DA when training our models... We train ResNet-50 models for 88 epochs with SGD with momentum 0.9, using batch size 1024, weight decay 10^-4, and label smoothing 0.1. We use a cyclic learning rate schedule starting from the initial learning rate 10^-4 with the peak value 1 after 2 epochs and linearly decaying to 0 until the end of training. (See the training setup sketch after the table.)
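
The class-conditional augmentation strategy quoted under Research Type amounts to choosing the augmentation based on the class label. The following is a minimal PyTorch sketch of that idea, assuming milder Random Resized Crops for the negatively affected classes; the class IDs in AFFECTED_CLASSES, the transform parameters, and the wrapper name are illustrative placeholders, not the authors' implementation.

```python
# Sketch: pick the augmentation pipeline per class label (class-conditional DA).
from torchvision import transforms
from torch.utils.data import Dataset

AFFECTED_CLASSES = {107, 456, 836}  # hypothetical IDs of classes hurt by standard RRC

default_tf = transforms.Compose([
    transforms.RandomResizedCrop(176),                     # standard RRC for most classes
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
mild_tf = transforms.Compose([
    transforms.RandomResizedCrop(176, scale=(0.7, 1.0)),   # milder crops for affected classes
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

class ClassConditionalAugDataset(Dataset):
    """Wraps a (PIL image, label) dataset and picks the transform from the label."""
    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        img, label = self.base[idx]
        tf = mild_tf if label in AFFECTED_CLASSES else default_tf
        return tf(img), label
```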
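
The Dataset Splits row defines the notation f_s(·), l_ReaL(x), X, and X_k used for per-class evaluation. Below is a minimal sketch of computing per-class accuracy under both the original labels and the ReaL labels, assuming a validation loader that also yields each image's index and a list real_labels mapping that index to its set of ReaL labels; these names and the data layout are assumptions for illustration, not the paper's code.

```python
# Sketch: per-class accuracy on the original label (X_k) and on the ReaL label set l_ReaL(x).
import torch

@torch.no_grad()
def per_class_accuracy(model, loader, real_labels, num_classes=1000, device="cuda"):
    correct_orig = torch.zeros(num_classes)
    correct_real = torch.zeros(num_classes)
    counts = torch.zeros(num_classes)
    model.eval()
    for images, orig_label, idx in loader:                # idx indexes into real_labels
        preds = model(images.to(device)).argmax(dim=1).cpu()
        for p, y, i in zip(preds, orig_label, idx):
            counts[y] += 1
            correct_orig[y] += int(p == y)                        # standard accuracy on X_k
            correct_real[y] += int(p.item() in real_labels[int(i)])  # prediction counts if in l_ReaL(x)
    counts = counts.clamp(min=1)
    return correct_orig / counts, correct_real / counts
```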
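
The hyperparameters quoted under Experiment Setup translate into a short PyTorch sketch as follows. OneCycleLR is used here as one way to realize the described cyclic schedule, and steps_per_epoch plus the data pipeline (RRC at resolution 176, horizontal flips, batch size 1024) are assumptions rather than values taken from released code.

```python
# Sketch: ResNet-50, 88 epochs, SGD (momentum 0.9, weight decay 1e-4),
# label smoothing 0.1, cyclic LR ramping 1e-4 -> 1 over 2 epochs, then linear decay to ~0.
from torch import nn, optim
from torchvision.models import resnet50

epochs = 88
steps_per_epoch = 1251            # placeholder: ~1.28M ImageNet images / batch size 1024

model = resnet50()
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)     # label smoothing alpha = 0.1
optimizer = optim.SGD(model.parameters(), lr=1e-4,       # initial learning rate 1e-4
                      momentum=0.9, weight_decay=1e-4)

scheduler = optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=1.0,                   # peak learning rate reached after 2 epochs
    epochs=epochs,
    steps_per_epoch=steps_per_epoch,
    pct_start=2 / epochs,         # warm-up fraction: 2 of 88 epochs
    anneal_strategy="linear",     # linear decay after the peak
    div_factor=1e4,               # initial lr = max_lr / div_factor = 1e-4
    final_div_factor=1e6,         # final lr is effectively 0
)
# In the training loop, scheduler.step() is called once per batch after optimizer.step().
```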