Boosting Adversarial Training with Hypersphere Embedding
Authors: Tianyu Pang, Xiao Yang, Yinpeng Dong, Kun Xu, Jun Zhu, Hang Su
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the effectiveness and adaptability of HE by embedding it into the popular AT frameworks including PGD-AT, ALP, and TRADES, as well as the Free AT and Fast AT strategies. In the experiments, we evaluate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets, which verifies that integrating HE can consistently enhance the model robustness for each AT framework with little extra computation. ... CIFAR-10 [31] setup. We apply the wide residual network WRN-34-10 as the model architecture [77]. For each AT framework, we set the maximal perturbation ϵ = 8/255, the perturbation step size η = 2/255, and the number of iterations K = 10. We apply the momentum SGD [49] optimizer with the initial learning rate of 0.1, and train for 100 epochs. |
| Researcher Affiliation | Collaboration | Tianyu Pang, Xiao Yang, Yinpeng Dong, Kun Xu, Jun Zhu, Hang Su. Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua-Bosch Joint ML Center, THBI Lab, Tsinghua University, Beijing, China. Emails: {pty17, yangxiao19, dyp17}@mails.tsinghua.edu.cn, kunxu.thu@gmail.com, {suhangss, dcszj}@mail.tsinghua.edu.cn |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks (e.g., labeled “Algorithm” or “Pseudocode”). |
| Open Source Code | Yes | Code is available at https://github.com/ShawnXYang/AT_HE. |
| Open Datasets | Yes | In the experiments, we evaluate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets, which verifies that integrating HE can consistently enhance the model robustness for each AT framework with little extra computation. (Citations [31] for CIFAR-10 and [15] for ImageNet.) |
| Dataset Splits | Yes | CIFAR-10 [31] setup. ImageNet [15] setup. (Implicitly uses the standard, well-defined splits of these benchmark datasets.) |
| Hardware Specification | No | The paper mentions “four GPU workers” for Free AT on ImageNet but does not specify the type or model of the GPUs or any other hardware components (CPU, memory, specific cloud instances) used for the experiments. |
| Software Dependencies | No | The paper refers to specific optimizers and frameworks (e.g., “momentum SGD”, “PGD-AT”, “ALP”, “TRADES”) but does not provide specific version numbers for any software, libraries, or programming languages used in the implementation (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | CIFAR-10 [31] setup. We apply the wide residual network WRN-34-10 as the model architecture [77]. For each AT framework, we set the maximal perturbation ϵ = 8/255, the perturbation step size η = 2/255, and the number of iterations K = 10. We apply the momentum SGD [49] optimizer with the initial learning rate of 0.1, and train for 100 epochs. The learning rate decays with a factor of 0.1 at 75 and 90 epochs, respectively. The mini-batch size is 128. Besides, we set the regularization parameter 1/λ as 6 for TRADES, and set the adversarial logit pairing weight as 0.5 for ALP [29, 81]. The scale s = 15 and the margin m = 0.2 in HE... and similar details for the ImageNet setup. (A hedged configuration sketch based on these reported values follows the table.) |
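
The quoted CIFAR-10 setup can be read as a concrete training configuration. Below is a minimal, hedged PyTorch-style sketch of PGD-AT with a hypersphere-embedding head using the reported values (ϵ = 8/255, η = 2/255, K = 10, s = 15, m = 0.2, SGD with initial learning rate 0.1 decayed by 0.1 at epochs 75 and 90). The names `HEHead` and `pgd_attack`, the exact margin formulation, and details such as SGD momentum and weight decay are illustrative assumptions, not the authors' implementation; the released code is at https://github.com/ShawnXYang/AT_HE.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HEHead(nn.Module):
    """Classifier head with feature/weight normalization (hypersphere embedding sketch)."""
    def __init__(self, feat_dim, num_classes, s=15.0, m=0.2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim) * 0.01)
        self.s, self.m = s, m

    def forward(self, feats, labels=None):
        # Cosine similarity between L2-normalized features and L2-normalized class weights.
        cos = F.linear(F.normalize(feats, dim=1), F.normalize(self.weight, dim=1))
        if labels is not None:
            # Assumed additive-margin form: subtract m from the true-class cosine during training.
            cos = cos - self.m * F.one_hot(labels, cos.size(1)).float()
        return self.s * cos  # scaled logits, fed to standard cross-entropy


def pgd_attack(model, head, x, y, eps=8 / 255, eta=2 / 255, K=10):
    """Untargeted L-inf PGD with the quoted eps / step size / iteration count (sketch)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(K):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(head(model(x_adv)), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + eta * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()


# Optimizer and schedule quoted in the table (momentum and weight decay values are assumptions):
# train model and head jointly for 100 epochs with batch size 128.
# optimizer = torch.optim.SGD(list(model.parameters()) + list(head.parameters()),
#                             lr=0.1, momentum=0.9, weight_decay=5e-4)
# scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[75, 90], gamma=0.1)
```

In a PGD-AT training loop under these assumptions, each clean batch would be replaced by `x_adv = pgd_attack(model, head, x, y)` and the loss computed as `F.cross_entropy(head(model(x_adv), labels=y), y)`, so the angular margin is applied only in the training objective; ALP, TRADES, Free AT, and Fast AT would modify this loop in their usual ways.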