A Direct Approach to Robust Deep Learning Using Adversarial Networks

Authors: Huaxia Wang, Chun-Nam Yu

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show empirically that our adversarial network approach works well against black-box attacks, with performance on par with state-of-the-art methods such as ensemble adversarial training and adversarial training with projected gradient descent. Experimental results are shown in Section 4, with conclusions of the paper in Section 5.
Researcher Affiliation | Collaboration | Huaxia Wang, Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA, hwang38@stevens.edu; Chun-Nam Yu, Nokia Bell Labs, 600 Mountain Avenue, Murray Hill, NJ 07974, USA, chun-nam.yu@nokia-bell-labs.com
Pseudocode | No | The paper describes mathematical equations and network architectures but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our implementation is available at https://github.com/whxbergkamp/RobustDL_GAN.
Open Datasets | Yes | For MNIST the inputs are black and white images of digits of size 28x28 with pixel values scaled between 0 and 1. For the Street View House Number (SVHN) data, we use the original training set, augmented with 80k randomly sampled images from the extra set as our training data. For CIFAR10 we scale the 32x32 inputs to the range of [-1, 1]. (A preprocessing sketch based on these details follows the table.)
Dataset Splits | No | The paper describes training and test sets for its experiments (MNIST, SVHN, CIFAR10, CIFAR100) and mentions data augmentation, but it does not explicitly specify a separate validation split or a cross-validation strategy for hyperparameter tuning or model selection.
Hardware Specification | Yes | We implemented our adversarial network approach using TensorFlow (Abadi et al., 2016), with the experiments run on several machines each with 4 GTX 1080 Ti GPUs.
Software Dependencies | No | We implemented our adversarial network approach using TensorFlow (Abadi et al., 2016)... The paper mentions TensorFlow and PyTorch but does not specify their version numbers or any other software dependencies with specific versions.
Experiment Setup | Yes | We use SGD with a learning rate of ηD = 0.01 and momentum 0.9, a batch size of 64, and run for 200k iterations for all the discriminative networks. The learning rates are decreased by a factor of 10 after 100k iterations. We use SGD with a fixed learning rate ηG = 0.01 with momentum 0.9 for the generative network. We use weight decay of 1E-4 for standard and adversarial PGD training, and 1E-5 for our adversarial network approach (for both Dθ and Gφ). For this dataset we find that we can improve the robustness of Dθ by running more updates on Gφ, so we run 5 updates on Gφ (each update contains 5 gradient steps described in Section 3.2) for each update on Dθ. (A training-schedule sketch based on these hyperparameters follows below.)
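
The Open Datasets row quotes the input scaling used for MNIST and CIFAR10. The snippet below is a minimal sketch of that scaling using tf.keras.datasets; it is not the authors' released preprocessing code, and the SVHN "extra" augmentation is omitted.

```python
# Hypothetical preprocessing sketch: MNIST scaled to [0, 1], CIFAR10 to [-1, 1],
# as stated in the Open Datasets row. Not the authors' pipeline.
import numpy as np
import tensorflow as tf

# MNIST: 28x28 grayscale digit images, pixel values scaled to [0, 1].
(x_mnist, y_mnist), _ = tf.keras.datasets.mnist.load_data()
x_mnist = x_mnist.astype(np.float32) / 255.0

# CIFAR10: 32x32 color images, pixel values scaled to [-1, 1].
(x_cifar, y_cifar), _ = tf.keras.datasets.cifar10.load_data()
x_cifar = x_cifar.astype(np.float32) / 127.5 - 1.0

print(x_mnist.min(), x_mnist.max())  # 0.0 1.0
print(x_cifar.min(), x_cifar.max())  # -1.0 1.0
```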
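
The Experiment Setup row lists the optimizer hyperparameters and the alternating update schedule between the generative network Gφ and the discriminative network Dθ. The sketch below wires those numbers into a TensorFlow training loop under stated assumptions: the models, loss functions, and `dataset` iterator are placeholders, and only the hyperparameters come from the quoted text; it is an illustration, not the authors' implementation.

```python
# Hedged sketch of the quoted training schedule: SGD with momentum for both
# networks, step decay of the discriminator learning rate after 100k iterations,
# and 5 G_phi updates (of 5 gradient steps each) per D_theta update.
import tensorflow as tf

# Discriminator learning rate eta_D = 0.01, divided by 10 after 100k iterations.
d_lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[100_000], values=[0.01, 0.001])
d_opt = tf.keras.optimizers.SGD(learning_rate=d_lr, momentum=0.9)
# Generator uses a fixed learning rate eta_G = 0.01.
g_opt = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)

WEIGHT_DECAY = 1e-5      # paper's value for both D_theta and G_phi; in Keras this
                         # would typically be attached via kernel_regularizer=
                         # tf.keras.regularizers.l2(WEIGHT_DECAY) on each layer.
G_UPDATES_PER_D = 5      # 5 updates on G_phi per update on D_theta
STEPS_PER_G_UPDATE = 5   # each G_phi update contains 5 gradient steps (Sec. 3.2)

def train(d_model, g_model, d_loss_fn, g_loss_fn, dataset, num_iterations=200_000):
    """Alternating updates; `dataset` yields (x, y) batches of size 64."""
    it = iter(dataset)
    for step in range(num_iterations):
        # Generator phase: several gradient steps on G_phi.
        for _ in range(G_UPDATES_PER_D * STEPS_PER_G_UPDATE):
            x, y = next(it)
            with tf.GradientTape() as tape:
                g_loss = g_loss_fn(d_model, g_model, x, y)
            grads = tape.gradient(g_loss, g_model.trainable_variables)
            g_opt.apply_gradients(zip(grads, g_model.trainable_variables))

        # Discriminator phase: one gradient step on D_theta.
        x, y = next(it)
        with tf.GradientTape() as tape:
            d_loss = d_loss_fn(d_model, g_model, x, y)
        grads = tape.gradient(d_loss, d_model.trainable_variables)
        d_opt.apply_gradients(zip(grads, d_model.trainable_variables))
```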