AugMax: Adversarial Composition of Random Augmentations for Robust Training
Authors: Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Anima Anandkumar, Zhangyang Wang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that AugMax-DuBIN leads to significantly improved out-of-distribution robustness, outperforming prior arts by 3.03%, 3.49%, 1.82% and 0.71% on CIFAR10-C, CIFAR100-C, Tiny ImageNet-C and ImageNet-C. |
| Researcher Affiliation | Collaboration | Department of Electrical and Computer Engineering, University of Texas at Austin; NVIDIA; Arizona State University; California Institute of Technology |
| Pseudocode | Yes | The algorithm to solve Eq. (5) is summarized in Appendix A. |
| Open Source Code | Yes | Codes and pretrained models are available: https://github.com/VITA-Group/AugMax. |
| Open Datasets | Yes | We evaluate our proposed method on CIFAR10, CIFAR100 [31], ImageNet [32] and Tiny ImageNet (TIN). |
| Dataset Splits | No | The paper uses standard public datasets (CIFAR10/100, ImageNet, Tiny ImageNet), which come with predefined train/test splits, and explicitly mentions using corrupted test sets for evaluation. However, it does not report explicit train/validation/test split sizes, nor does it describe a distinct validation set. |
| Hardware Specification | Yes | All experiments are conducted on a server with four NVIDIA RTX A6000 GPUs. |
| Software Dependencies | No | The paper mentions optimizers (SGD) and model architectures (ResNet, WRN, ResNeXt) but does not provide specific version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Specifically, for all experiments on CIFAR10 and CIFAR100, we use SGD optimizer with initial learning rate 0.1 and cosine annealing learning rate scheduler, and train all models for 200 epochs. For all experiments on ImageNet, we use SGD optimizer with initial learning rate 0.1 and batch size 256 to train the model for 90 epochs. We set batch size to 256 for all experiments. Both m and p are updated with step size α = 0.1. We set n = 5 on ImageNet for efficiency and n = 10 on other datasets. We set λ in Eq. (5) to be 12 on ImageNet and 10 on all other datasets, except for ResNet18 on CIFAR100 where we find λ = 1 leads to better performance. |
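
The Pseudocode and Experiment Setup rows together outline how the adversarial mixing parameters m and p are optimized: n gradient steps of size α = 0.1, with λ weighting the consistency term in the full objective of Eq. (5) (not shown here). The sketch below is a hedged reconstruction of that inner maximization, assuming a PyTorch-style workflow; the exact update rule is given in Appendix A of the paper, and the function name `augmax_inner_max`, the tensor layout, and the plain gradient-ascent update are illustrative assumptions, not the authors' implementation.

```python
# Hedged reconstruction of the AugMax inner maximization (Appendix A of the
# paper gives the exact algorithm; this sketch only mirrors the hyperparameters
# quoted above). Assumptions: PyTorch, a (K, B, C, H, W) tensor of K randomly
# augmented copies, plain gradient ascent on m and p with step size alpha.
import torch
import torch.nn.functional as F

def augmax_inner_max(model, x_clean, x_augs, y, n_steps=10, alpha=0.1):
    """Adversarially compose pre-sampled random augmentations.

    n_steps = 10 on CIFAR/Tiny ImageNet, 5 on ImageNet (for efficiency);
    alpha = 0.1 is the step size quoted for updating both m and p.
    """
    K, B = x_augs.shape[0], x_augs.shape[1]
    # m interpolates clean image vs. augmented mixture; p weights the K views.
    m = torch.rand(B, 1, 1, 1, device=x_clean.device, requires_grad=True)
    p = torch.rand(B, K, device=x_clean.device, requires_grad=True)

    def compose(m_, p_):
        w = F.softmax(p_, dim=1)                        # normalized mixture weights
        x_mix = torch.einsum('bk,kbchw->bchw', w, x_augs)
        return (1.0 - m_) * x_clean + m_ * x_mix

    for _ in range(n_steps):
        loss = F.cross_entropy(model(compose(m, p)), y)
        g_m, g_p = torch.autograd.grad(loss, [m, p])
        with torch.no_grad():                           # ascend to maximize the loss
            m.add_(alpha * g_m).clamp_(0.0, 1.0)
            p.add_(alpha * g_p)

    with torch.no_grad():
        return compose(m, p)                            # worst-case augmented batch
```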
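
For the outer training loop, the Experiment Setup row quotes SGD with initial learning rate 0.1, a cosine annealing schedule, 200 epochs on CIFAR (90 on ImageNet), and batch size 256. A minimal configuration sketch under those values, assuming standard PyTorch components (the paper does not pin library versions), might look like the following; momentum and weight decay are not stated in the quoted text and are marked as assumptions.

```python
# Minimal sketch of the quoted training configuration; assumes PyTorch.
# Momentum and weight-decay values are NOT given in the quoted setup text
# and are common defaults used here purely for illustration.
import torch.nn as nn
import torch.optim as optim

EPOCHS = 200       # 200 epochs on CIFAR10/100; the paper uses 90 on ImageNet
BATCH_SIZE = 256   # batch size 256 for all experiments
INIT_LR = 0.1      # initial SGD learning rate

def build_optimizer_and_scheduler(model: nn.Module):
    optimizer = optim.SGD(
        model.parameters(),
        lr=INIT_LR,
        momentum=0.9,        # assumption: not stated in the quoted setup
        weight_decay=5e-4,   # assumption: not stated in the quoted setup
    )
    # Cosine annealing learning-rate schedule over the full run, as quoted.
    scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)
    return optimizer, scheduler
```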