Adversarial Defence by Diversified Simultaneous Training of Deep Ensembles
Authors: Bo Huang, Zhiwei Ke, Yi Wang, Wei Wang, Linlin Shen, Feng Liu
AAAI 2021, pp. 7823–7831
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive evaluations under white-box and black-box attacks including transferred examples and adaptive attacks. Our approach achieves a significant gain of up to 52% in adversarial robustness, compared with the baseline and the state-of-the-art method on image benchmarks with complex data scenes. |
| Researcher Affiliation | Academia | Bo Huang (1)*, Zhiwei Ke (1,2), Yi Wang (1), Wei Wang (3), Linlin Shen (2,4), Feng Liu (2); (1) Dongguan University of Technology, Dongguan, China; (2) Computer Vision Institute, Shenzhen University, Shenzhen, China; (3) The University of New South Wales, Sydney, Australia; (4) Shenzhen Institute of Artificial Intelligence & Robotics for Society |
| Pseudocode | Yes | Algorithm 1 outlines the main procedures. Specifically, we divide the ensemble range of activation strength into M > K intervals and count the number of neurons of K base networks that fall in the intervals G1, G2, ..., GM, respectively. ... Algorithm 2 outlines the main procedure of our gradient regularization. |
| Open Source Code | Yes | The source code is available at https://github.com/ALIS-Lab/AAAI2021-PDD. |
| Open Datasets | Yes | Datasets. We evaluate our method on three image benchmarks with increasing complexity and cluttered scenes, namely Fashion-MNIST, CIFAR-100, and Tiny-ImageNet. In particular, Fashion-MNIST has 10-class (L = 10) labels and consists of 60k training samples and 10k testing samples each with 28×28 resolution; CIFAR-100 has 100-class (L = 100) labels and contains 50k training samples and 10k testing samples each with 32×32×3 resolution; Tiny-ImageNet has 200 classes (L = 200), containing 100k and 10k samples each of 64×64×3 resolution for training and validation testing, respectively. In all cases, the image intensity is normalized to 1 in our experiments. |
| Dataset Splits | Yes | Fashion-MNIST has 10-class (L = 10) labels and consists of 60k training samples and 10k testing samples each with 28×28 resolution; CIFAR-100 has 100-class (L = 100) labels and contains 50k training samples and 10k testing samples each with 32×32×3 resolution; Tiny-ImageNet has 200 classes (L = 200), containing 100k and 10k samples each of 64×64×3 resolution for training and validation testing, respectively. |
| Hardware Specification | Yes | We test the training time per epoch with a mini-batch size of 64 on CIFAR-100. When K = 3, for example, it takes 54s/epoch for baseline, 64s/epoch for ADP, 93s/epoch for PDD, and 703s/epoch for DEG on Tesla V100. |
| Software Dependencies | Yes | Our implementation is based on PyTorch and the Adversarial Robustness 360 Toolbox (ART) v1.1 library. |
| Experiment Setup | Yes | The PDD method is applied to the last FC layer of 512 neurons before the softmax layer. Two cases of K = 3 and K = 5 are tested with model parameters set as described in the PDD section. Note that our methods do not require any specification on L. For the PDD regularization, we set α = 0.9 and β = 0.1 for computing the keep rate in (2) in all our experiments. Unless otherwise specified, we choose M = 10 for K = 3 and M = 20 for K = 5 empirically. For the DEG regularization, we set λ = 0.01 to control the penalty strength in (3). The learning rate is set to 0.001 for C&W with 1000 iteration steps. We test the training time per epoch with a mini-batch size of 64 on CIFAR-100. |
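The pseudocode excerpt above describes dividing the ensemble's range of activation strengths into M > K intervals and counting how many neurons of the K base networks fall into each interval G1, ..., GM. The sketch below illustrates only that counting step with NumPy; the function name, the pooling of neurons across networks, and the equal-width binning are assumptions for illustration, not the authors' released implementation.

```python
import numpy as np

def interval_counts(activations, M):
    """Count neurons per activation-strength interval.

    activations: list of K 1-D arrays, one per base network, holding the
    activation strengths of the monitored FC layer's neurons.
    Returns an array of length M where entry m is the number of neurons
    (pooled over all K networks) falling in interval G_{m+1}.
    """
    pooled = np.concatenate(activations)       # ensemble range over all K networks
    lo, hi = pooled.min(), pooled.max()        # span of activation strengths
    counts, _ = np.histogram(pooled, bins=M, range=(lo, hi))
    return counts

# Example matching the reported setting: K = 3 networks, a 512-neuron
# FC layer before softmax, and M = 10 intervals.
rng = np.random.default_rng(0)
acts = [rng.random(512) for _ in range(3)]
counts = interval_counts(acts, M=10)
print(len(counts), counts.sum())  # 10 intervals, 3 * 512 = 1536 neurons total
```

Every neuron lands in exactly one interval, so the counts always sum to K times the layer width; the per-interval distribution is what a diversity regularizer of this kind would then act on.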