Boosting Adversarial Training with Hypersphere Embedding
Authors: Tianyu Pang, Xiao Yang, Yinpeng Dong, Kun Xu, Jun Zhu, Hang Su
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the effectiveness and adaptability of HE by embedding it into the popular AT frameworks including PGD-AT, ALP, and TRADES, as well as the Free AT and Fast AT strategies. In the experiments, we evaluate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets, which verifies that integrating HE can consistently enhance the model robustness for each AT framework with little extra computation. ... CIFAR-10 [31] setup. We apply the wide residual network WRN-34-10 as the model architecture [77]. For each AT framework, we set the maximal perturbation ϵ = 8/255, the perturbation step size η = 2/255, and the number of iterations K = 10. We apply the momentum SGD [49] optimizer with the initial learning rate of 0.1, and train for 100 epochs. |
| Researcher Affiliation | Collaboration | Tianyu Pang, Xiao Yang, Yinpeng Dong, Kun Xu, Jun Zhu, Hang Su. Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua-Bosch Joint ML Center, THBI Lab, Tsinghua University, Beijing, China. Emails: {pty17, yangxiao19, dyp17}@mails.tsinghua.edu.cn, kunxu.thu@gmail.com, {suhangss, dcszj}@mail.tsinghua.edu.cn |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks (e.g., labeled “Algorithm” or “Pseudocode”). |
| Open Source Code | Yes | Code is available at https://github.com/ShawnXYang/AT_HE. |
| Open Datasets | Yes | In the experiments, we evaluate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets, which verifies that integrating HE can consistently enhance the model robustness for each AT framework with little extra computation. (Citations [31] for CIFAR-10 and [15] for ImageNet.) |
| Dataset Splits | Yes | CIFAR-10 [31] setup. ImageNet [15] setup. (Implicitly uses the standard, well-defined splits of these benchmark datasets.) |
| Hardware Specification | No | The paper mentions “four GPU workers” for Free AT on ImageNet but does not specify the type or model of the GPUs or any other hardware components (CPU, memory, specific cloud instances) used for the experiments. |
| Software Dependencies | No | The paper refers to specific optimizers and frameworks (e.g., “momentum SGD”, “PGD-AT”, “ALP”, “TRADES”) but does not provide specific version numbers for any software, libraries, or programming languages used in the implementation (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | CIFAR-10 [31] setup. We apply the wide residual network WRN-34-10 as the model architecture [77]. For each AT framework, we set the maximal perturbation ϵ = 8/255, the perturbation step size η = 2/255, and the number of iterations K = 10. We apply the momentum SGD [49] optimizer with the initial learning rate of 0.1, and train for 100 epochs. The learning rate decays with a factor of 0.1 at 75 and 90 epochs, respectively. The mini-batch size is 128. Besides, we set the regularization parameter 1/λ as 6 for TRADES, and set the adversarial logit pairing weight as 0.5 for ALP [29, 81]. The scale s = 15 and the margin m = 0.2 in HE... and similar details for the ImageNet setup. (A hedged configuration sketch based on these reported values follows the table.) |
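
The quoted CIFAR-10 setup can be read as a concrete training configuration. Below is a minimal, hedged PyTorch-style sketch of PGD-AT with a hypersphere-embedding head using the reported values (ϵ = 8/255, η = 2/255, K = 10, s = 15, m = 0.2, SGD with initial learning rate 0.1 decayed by 0.1 at epochs 75 and 90). The names `HEHead` and `pgd_attack`, the exact margin formulation, and details such as SGD momentum and weight decay are illustrative assumptions, not the authors' implementation; the released code is at https://github.com/ShawnXYang/AT_HE.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HEHead(nn.Module):
    """Classifier head with feature/weight normalization (hypersphere embedding sketch)."""
    def __init__(self, feat_dim, num_classes, s=15.0, m=0.2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim) * 0.01)
        self.s, self.m = s, m

    def forward(self, feats, labels=None):
        # Cosine similarity between L2-normalized features and L2-normalized class weights.
        cos = F.linear(F.normalize(feats, dim=1), F.normalize(self.weight, dim=1))
        if labels is not None:
            # Assumed additive-margin form: subtract m from the true-class cosine during training.
            cos = cos - self.m * F.one_hot(labels, cos.size(1)).float()
        return self.s * cos  # scaled logits, fed to standard cross-entropy


def pgd_attack(model, head, x, y, eps=8 / 255, eta=2 / 255, K=10):
    """Untargeted L-inf PGD with the quoted eps / step size / iteration count (sketch)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(K):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(head(model(x_adv)), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + eta * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()


# Optimizer and schedule quoted in the table (momentum and weight decay values are assumptions):
# train model and head jointly for 100 epochs with batch size 128.
# optimizer = torch.optim.SGD(list(model.parameters()) + list(head.parameters()),
#                             lr=0.1, momentum=0.9, weight_decay=5e-4)
# scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[75, 90], gamma=0.1)
```

In a PGD-AT training loop under these assumptions, each clean batch would be replaced by `x_adv = pgd_attack(model, head, x, y)` and the loss computed as `F.cross_entropy(head(model(x_adv), labels=y), y)`, so the angular margin is applied only in the training objective; ALP, TRADES, Free AT, and Fast AT would modify this loop in their usual ways.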