Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On the Trade-off between Adversarial and Backdoor Robustness

Authors: Cheng-Hsin Weng, Yan-Ting Lee, Shan-Hung (Brandon) Wu

NeurIPS 2020 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we conduct experiments to study whether adversarial robustness and backdoor robustness can affect each other, and find a trade-off: by increasing a network's robustness to adversarial examples, the network becomes more vulnerable to backdoor attacks.
Researcher Affiliation | Academia | Cheng-Hsin Weng, Yan-Ting Lee, Shan-Hung Wu — Department of Computer Science, National Tsing-Hua University, Taiwan, R.O.C.
Pseudocode | No | The paper describes methods and processes in narrative text and through experimental settings, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/nthu-datalab/On.the.Trade-off.between.Adversarial.and.Backdoor.Robustness.
Open Datasets | Yes | This finding is consistent on all the real-world datasets, including MNIST [23], CIFAR-10 [22], and ImageNet [9], and across all the settings we have tested.
Dataset Splits | No | The paper mentions training, poisoning, and evaluation on a test set, but it does not explicitly specify a validation split percentage or how a validation set was used for hyperparameter tuning or model selection.
Hardware Specification | Yes | We implement all the models using TensorFlow and train them on a cluster of machines with 80 NVIDIA Tesla V100 GPUs.
Software Dependencies | No | The paper mentions "TensorFlow" but does not specify a version number or other software dependencies with version numbers.
Experiment Setup | Yes | Specifically, we use projected gradient descent (PGD) with an ℓ∞-norm constraint as the attack model of the adversarial training algorithm and set its parameters epsilon (ε)/step size/number of iterations to 0.3/0.05/10 for MNIST, 8/2/5 for CIFAR-10, and 8/2/5 for ImageNet, respectively. In terms of network architecture, we use a naive CNN for MNIST, ResNet-32 for CIFAR-10, and a pretrained ResNet-50 for ImageNet.
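The PGD attack quoted above follows a simple iterative recipe: take a signed-gradient ascent step on the loss, then project the perturbed input back into an ε-ball around the clean input. A minimal numpy sketch of that loop is shown below, using the paper's MNIST parameters (ε = 0.3, step size 0.05, 10 iterations); the function name and the `grad_fn` callback interface are illustrative assumptions, not taken from the authors' released code.

```python
import numpy as np

def pgd_linf(x, grad_fn, epsilon=0.3, step_size=0.05, num_iter=10):
    """Projected gradient descent under an l-infinity constraint.

    x       : clean input, values assumed to lie in [0, 1]
    grad_fn : callable returning the gradient of the loss w.r.t. the input
    epsilon : radius of the l-infinity perturbation ball
    """
    x_adv = x.copy()
    for _ in range(num_iter):
        # Ascend the loss along the sign of the input gradient.
        x_adv = x_adv + step_size * np.sign(grad_fn(x_adv))
        # Project back into the epsilon-ball around the clean input.
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
        # Keep values in the valid input range.
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv

# Toy usage: for a loss whose gradient is all-ones, every pixel is pushed
# upward until the epsilon projection caps the perturbation at 0.3.
x_clean = np.full((4,), 0.5)
x_attacked = pgd_linf(x_clean, lambda z: np.ones_like(z))
```

In a real adversarial-training loop, `grad_fn` would be the model's loss gradient with respect to the input (e.g. computed via `tf.GradientTape` in TensorFlow), and the adversarial batch produced here would replace the clean batch in each training step.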