Large Norms of CNN Layers Do Not Hurt Adversarial Robustness

Authors: Youwei Liang, Dong Huang

AAAI 2021, pp. 8565-8573

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | From the paper's Experiments section: "Firstly, we show our approaches for computing norms of Conv2d are very efficient. In the second part, we conduct extensive experiments to investigate if regularizing the norms of CNN layers is effective in improving adversarial robustness. In the third part, we compare the norms of the layers of adversarially robust CNNs against their non-adversarially robust counterparts." (A hedged sketch of such a Conv2d norm computation appears after the table.)
Researcher Affiliation | Academia | "1 College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China; 2 Pazhou Lab, Guangzhou, China"
Pseudocode | Yes | Algorithm 1 (Norm Decay); a runnable sketch appears after the table.
Input: loss function L (assuming it is to be minimized), parameters θ, momentum γ, regularization parameter β
Output: parameters θ
1: h ← 0 (initialize the gradient of norms of layers)
2: repeat
3: g ← ∇θ L
4: Compute p, the gradient of the ℓ1 or ℓ∞ norm of each fully-connected and convolutional layer
5: h ← γh + (1 - γ)p
6: g ← g + (β/N)h
7: θ ← SGD(θ, g)
8: until convergence
Open Source Code | Yes | "The code is available at https://github.com/youweiliang/norm_robustness."
Open Datasets | Yes | "We conduct experiments with various models on CIFAR-10 (Krizhevsky and Hinton 2009)." (A standard loading snippet appears after the table.)
Dataset Splits | No | The paper mentions training on CIFAR-10 and evaluating on a "test set", but it does not specify exact training/validation/test split percentages or absolute sample counts for each split, nor does it reference predefined splits with an explicit citation for the split methodology.
Hardware Specification | Yes | "The experiments are conducted on a machine with a GTX 1080 Ti GPU, an Intel Core i5-9400F 6-core CPU, and 32 GB RAM."
Software Dependencies | No | The paper mentions the SGD optimizer and AutoAttack (a third-party tool), but it does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup | Yes | "We set the regularization parameter to different values... In norm decay, we simply set the hyperparameter γ (momentum) to 0.5 and test the other hyperparameter β in {10^-5, ..., 10^-2}. We also test the regularization parameter of weight decay in {10^-5, ..., 10^-2} and test SVC by clipping the singular values to {2.0, 1.5, 1.0, 0.5}, respectively... We use the SGD optimizer with momentum of 0.9 and set the initial learning rate to 0.01. We train the models for 120 epochs and decay the learning rate by a factor of 0.1 at epochs 75, 90, and 100." (A sketch of this configuration appears after the table.)
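
The Research Type row quotes the paper's claim that its approaches for computing norms of Conv2d layers are very efficient. For reference, below is a minimal PyTorch sketch of closed-form ℓ1 and ℓ∞ operator norms of a convolution, assuming stride 1 and enough padding that some output position sees the full kernel; it illustrates why such norms are cheap to compute (linear in the number of weights) and is not necessarily the authors' exact method (their code is in the repository linked above).

    import torch
    import torch.nn as nn

    def conv_linf_norm(conv: nn.Conv2d) -> torch.Tensor:
        # l_inf -> l_inf operator norm: the largest absolute "row sum" of
        # the underlying linear map, i.e., the maximum over output channels
        # of the l1 norm of that channel's kernel. Exact under the
        # full-overlap assumption above; otherwise an upper bound.
        w = conv.weight  # shape (out_channels, in_channels, kH, kW)
        return w.abs().sum(dim=(1, 2, 3)).max()

    def conv_l1_norm(conv: nn.Conv2d) -> torch.Tensor:
        # l1 -> l1 operator norm: the largest absolute "column sum", i.e.,
        # the maximum over input channels of the total absolute weight
        # attached to that channel.
        w = conv.weight
        return w.abs().sum(dim=(0, 2, 3)).max()

    # Usage on a random layer:
    layer = nn.Conv2d(3, 16, kernel_size=3, padding=1)
    print(conv_linf_norm(layer).item(), conv_l1_norm(layer).item())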
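The Pseudocode row reproduces Algorithm 1 (Norm Decay). Below is a minimal runnable sketch of that loop in PyTorch. The norm gradient of step 4 is stubbed out with the subgradient of the plain ℓ1 norm of each weight tensor, which is an assumption for illustration; `model`, `loss_fn`, and `data_loader` are supplied by the caller.

    import torch

    def norm_subgradient(weight: torch.Tensor) -> torch.Tensor:
        # Step 4 placeholder: gradient of the l1 or l_inf norm of a layer.
        # Here, the subgradient of the plain l1 norm (an assumption).
        return weight.sign()

    def train_norm_decay(model, loss_fn, data_loader,
                         lr=0.01, sgd_momentum=0.9, gamma=0.5, beta=1e-3,
                         epochs=120):
        # Weights of fully-connected and convolutional layers (ndim > 1).
        params = [p for p in model.parameters() if p.dim() > 1]
        N = len(params)
        opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=sgd_momentum)
        h = [torch.zeros_like(p) for p in params]          # step 1: h <- 0
        for _ in range(epochs):                            # step 2: repeat
            for x, y in data_loader:
                opt.zero_grad()
                loss_fn(model(x), y).backward()            # step 3: g <- grad_theta L
                for p, hp in zip(params, h):
                    pg = norm_subgradient(p)               # step 4
                    hp.mul_(gamma).add_(pg, alpha=1 - gamma)  # step 5: h <- gamma*h + (1-gamma)*p
                    p.grad.add_(hp, alpha=beta / N)        # step 6: g <- g + (beta/N)*h
                opt.step()                                 # step 7: theta <- SGD(theta, g)
        return model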
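The Open Datasets and Dataset Splits rows note that CIFAR-10 is used without an explicit split specification. CIFAR-10 itself ships with a fixed 50,000-image training set and 10,000-image test set, which is the standard split a torchvision loader yields (the torchvision usage here is an assumption, not taken from the paper):

    import torchvision
    import torchvision.transforms as T

    # Standard CIFAR-10 loading; train=True/False selects the fixed
    # 50,000/10,000 train/test split that the dataset ships with.
    train_set = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True, transform=T.ToTensor())
    test_set = torchvision.datasets.CIFAR10(
        root="./data", train=False, download=True, transform=T.ToTensor())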
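Finally, the Experiment Setup row's schedule maps directly onto a standard optimizer/scheduler pair. The sketch below assumes PyTorch (the paper does not state its framework or versions), and the ResNet-18 model is an arbitrary placeholder:

    import torch
    import torchvision

    model = torchvision.models.resnet18(num_classes=10)  # placeholder model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    # Decay the learning rate by a factor of 0.1 at epochs 75, 90, and 100.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[75, 90, 100], gamma=0.1)

    for epoch in range(120):
        # ... one epoch of CIFAR-10 training would go here ...
        scheduler.step()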