Regularizing Deep Networks Using Efficient Layerwise Adversarial Training
Authors: Swami Sankaranarayanan, Arpit Jain, Rama Chellappa, Ser Nam Lim
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use these perturbations to train very deep models such as ResNets and Wide ResNets and show improvement in performance across datasets of different sizes such as CIFAR-10, CIFAR-100 and ImageNet. Our ablative experiments show that the proposed approach not only provides stronger regularization compared to Dropout but also improves adversarial robustness comparable to traditional adversarial training approaches. |
| Researcher Affiliation | Collaboration | Swami Sankaranarayanan, University of Maryland, College Park, MD (swamiviv@umiacs.umd.edu); Arpit Jain, GE Global Research, Niskayuna, NY (arpit.jain@ge.com); Rama Chellappa, University of Maryland, College Park, MD (rama@umiacs.umd.edu); Ser Nam Lim, GE Global Research, Niskayuna, NY (limser@ge.com) |
| Pseudocode | Yes | Algorithm 1: Efficient layerwise adversarial training procedure for improved regularization (a hedged sketch of the procedure follows this table) |
| Open Source Code | No | The paper states: 'For the ResNet networks, we use the publicly available torch implementation (Res 2017). For the VGG architecture, we use a publicly available implementation which consists of Batch Normalization (VGG 2017). For AlexNet: We used the publicly available implementation from the torch platform (Ale 2017)'. These refer to implementations of base models they used, not the source code for their proposed methodology. |
| Open Datasets | Yes | We use these perturbations to train very deep models such as ResNets and Wide ResNets and show improvement in performance across datasets of different sizes such as CIFAR-10, CIFAR-100 and ImageNet. Krizhevsky, A., and Hinton, G. 2009. Learning multiple layers of features from tiny images, https://www.cs.toronto.edu/~kriz/cifar.html. |
| Dataset Splits | Yes | ImageNet Experiment: To test the applicability of our regularization approach over a large scale dataset, we conducted an experiment using the ImageNet dataset (train: 1.2M images, val: 50K images). |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'torch implementation' for various models but does not specify any version numbers for Torch or other software dependencies. |
| Experiment Setup | Yes | For all the experiments, we use the SGD solver with Nesterov momentum of 0.9. The base learning rate is 0.1 and it is dropped by a factor of 5 every 60 epochs for CIFAR-100 and every 50 epochs for CIFAR-10. The total training duration is 300 epochs. We employ random flipping as a data augmentation procedure, and standard mean/std preprocessing was applied conforming to the original implementations. |
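The report only names Algorithm 1, not its steps. As a rough illustration of what an efficient layerwise adversarial training loop can look like, the PyTorch sketch below perturbs intermediate activations with the sign of activation gradients cached from the previous mini-batch, so no extra forward/backward pass is needed. The `PerturbedBlock` wrapper, the `eps` value, and the gradient-caching scheme are assumptions made for illustration, not the authors' implementation (which, per the paper, was built on Torch).

```python
import torch
import torch.nn as nn

class PerturbedBlock(nn.Module):
    """Wraps a layer and adds a gradient-sign perturbation to its output.

    The perturbation direction comes from the gradient of the loss with
    respect to this layer's activation, cached from the previous mini-batch
    (assumed reading of the 'efficient layerwise' idea; eps is illustrative).
    """

    def __init__(self, layer, eps=0.01):
        super().__init__()
        self.layer = layer
        self.eps = eps
        self.cached_grad = None  # activation gradient from the previous batch

    def forward(self, x):
        out = self.layer(x)
        if self.training and self.cached_grad is not None \
                and self.cached_grad.shape == out.shape:
            # Adversarial-style perturbation of the intermediate activation.
            out = out + self.eps * self.cached_grad.sign()
        if self.training and out.requires_grad:
            # Cache this batch's activation gradient for the next batch.
            out.register_hook(self._save_grad)
        return out

    def _save_grad(self, grad):
        self.cached_grad = grad.detach()

# Example: wrap the blocks of a small convolutional network and train as usual.
model = nn.Sequential(
    PerturbedBlock(nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())),
    PerturbedBlock(nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 10),
)
```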
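Based on the experiment-setup quote above, a minimal PyTorch sketch of the reported optimizer, learning-rate schedule, and CIFAR data augmentation might look as follows. The helper name, the use of `StepLR`, and the normalization statistics are assumptions; the paper only specifies "standard mean/std preprocessing" and was implemented in Torch rather than PyTorch.

```python
import torch
import torchvision.transforms as T

def make_optimizer_and_scheduler(model, dataset="cifar100"):
    # SGD solver with Nesterov momentum of 0.9 and a base learning rate of 0.1.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                momentum=0.9, nesterov=True)
    # Learning rate dropped by a factor of 5 every 60 epochs (CIFAR-100)
    # or every 50 epochs (CIFAR-10); training runs for 300 epochs in total.
    step = 60 if dataset == "cifar100" else 50
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=step, gamma=0.2)
    return optimizer, scheduler

# Random horizontal flipping plus mean/std normalization; the statistics below
# are commonly used CIFAR values, assumed rather than quoted from the paper.
train_transform = T.Compose([
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize(mean=(0.4914, 0.4822, 0.4465), std=(0.2470, 0.2435, 0.2616)),
])
```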