DiffAug: A Diffuse-and-Denoise Augmentation for Training Robust Classifiers

Authors: Chandramouli Shama Sastry, Sri Harsha Dumpala, Sageev Oore

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We introduce DiffAug, a simple and efficient diffusion-based augmentation technique to train image classifiers for the crucial yet challenging goal of improved classifier robustness. Applying DiffAug to a given example consists of one forward-diffusion step followed by one reverse-diffusion step. Using both ResNet-50 and Vision Transformer architectures, we comprehensively evaluate classifiers trained with DiffAug and demonstrate the surprising effectiveness of single-step reverse diffusion in improving robustness to covariate shifts, certified adversarial accuracy, and out-of-distribution detection.
Researcher Affiliation | Academia | Chandramouli S. Sastry, Sri Harsha Dumpala, Sageev Oore, Dalhousie University, Canada.
Pseudocode | No | The paper describes methods in prose and does not include explicit pseudocode blocks or algorithms labeled as such.
Open Source Code | Yes | Code available at https://github.com/oore-lab/diffaug
Open Datasets | Yes | We primarily conduct our experiments on ImageNet-1k and use the unconditional 256x256 Improved-DDPM [14, 35] diffusion model to generate the augmentations.
Dataset Splits | Yes | We use the standard data splits and demonstrate the robustness to hyperparameter choice in our ablation study (Appendix B.5).
Hardware Specification | Yes | For this paper, we had access to 8 40GB A40 GPUs to conduct our training and evaluation.
Software Dependencies | No | The paper refers to specific models and training recipes (e.g., 'ResNet-50', 'ViT-B-16', 'Improved-DDPM', 'DeiT-III recipe', 'torchvision resnet-50'), but does not list specific software dependencies with version numbers (e.g., PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | RN-50: We trained the model from scratch for 90 epochs using the same optimization hyperparameters used to train the official PyTorch RN-50 checkpoint. ViT: We used the two-stage training recipe proposed in DeiT-III. In particular, the training recipe consists of an 800-epoch supervised pretraining at a lower resolution (e.g., 192x192) followed by a 20-epoch finetuning at the target resolution (e.g., 224x224). Starting with the pretrained checkpoint (i.e., after 800 epochs), we finetune the classifier exactly following the prescribed optimization and augmentation hyperparameters (e.g., AutoAugment (AA) parameters and MixUp/CutMix parameters) except that we also consider DiffAug examples.
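The augmentation the abstract describes, one forward-diffusion step followed by a single reverse-diffusion step, can be sketched in plain NumPy. This is a minimal illustration under standard DDPM conventions: the linear beta schedule and the placeholder `eps_model` denoiser below are assumptions for demonstration, not the paper's Improved-DDPM model.

```python
import numpy as np

def diffaug(x0, t, alpha_bar, eps_model, rng=None):
    """One forward-diffusion step followed by one reverse-diffusion step.

    x0        : clean image array, values roughly in [-1, 1]
    t         : diffusion timestep index
    alpha_bar : cumulative noise schedule, alpha_bar[t] in (0, 1)
    eps_model : noise-prediction network eps_theta(x_t, t); here a
                hypothetical stand-in for a pretrained diffusion model
    """
    rng = rng or np.random.default_rng()
    a = alpha_bar[t]
    eps = rng.standard_normal(x0.shape)
    # Forward diffusion: add Gaussian noise at noise level t.
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    # Single reverse step: the standard DDPM estimate of x0 obtained
    # by inverting the forward process with the predicted noise.
    eps_hat = eps_model(x_t, t)
    x0_hat = (x_t - np.sqrt(1.0 - a) * eps_hat) / np.sqrt(a)
    return x0_hat

# Toy setup: a linear beta schedule (an illustrative assumption).
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)
```

In training, `x0_hat` would be fed to the classifier alongside (or instead of) the clean example; with a real denoiser, larger `t` removes more fine detail and yields stronger augmentation.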