Improving Adversarial Robustness via Probabilistically Compact Loss with Logit Constraints
Authors: Xin Li, Xiangrui Li, Deng Pan, Dongxiao Zhu
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively compare our method with the state-of-the-art using large-scale datasets under both white-box and black-box attacks to demonstrate its effectiveness. The source codes are available at https://github.com/xinli0928/PC-LC. Experimental results show that when trained with our method, CNNs can achieve significantly improved robustness against adversarial samples without compromising performance on predicting clean samples. |
| Researcher Affiliation | Academia | Xin Li*, Xiangrui Li*, Deng Pan*, Dongxiao Zhu, Department of Computer Science, Wayne State University, Detroit, MI 48202 {xiangruili, xinlee, pan.deng, dzhu}@wayne.edu |
| Pseudocode | No | No pseudocode or algorithm blocks are provided in the paper. |
| Open Source Code | Yes | The source codes are available at https://github.com/xinli0928/PC-LC. |
| Open Datasets | Yes | Datasets and models: We analyze seven benchmark datasets: MNIST, KMNIST, Fashion-MNIST (FMNIST), CIFAR-10, CIFAR-100, Street-View House Numbers (SVHN), and Tiny ImageNet. |
| Dataset Splits | No | The paper does not explicitly provide validation dataset splits. It mentions warming up the training process for a certain number of epochs but does not specify a separate validation set. |
| Hardware Specification | No | The paper does not specify the hardware used for experiments. |
| Software Dependencies | No | The paper mentions optimizers (Adam) and model architectures (LeNet-5, VGG-13, ResNet-56) but does not provide specific software libraries or their version numbers (e.g., PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | All these models are trained using the Adam optimizer with an initial learning rate of 0.01 and a batch size of 256. For our method, we first warm up the training process for T epochs (T = 50 for K/F/MNIST and T = 150 for other datasets) using CE loss, and then train the model using our method shown in Eqs. (6) and (16) (ξ = 0.995, λ = 0.05) for another T epochs, whereas we directly train the baseline using CE loss for 2T epochs. The number of iterations is set to 10 for BIM and 40 for MIM and PGD, while the per-step perturbation is 0.01. For parameters of the optimization-based attack C&W, the maximum number of iteration steps is set to 100, with a learning rate of 0.001, and the confidence is set to 0. The learning rate of SPSA is set to 0.01, and the step size is δ = 0.01 (Uesato et al. 2018). A code sketch of this training schedule and the PGD settings follows the table. |
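
Below is a minimal sketch of the two-stage training schedule quoted in the Experiment Setup row, assuming a PyTorch-style setup. The PC loss with logit constraints (Eqs. (6) and (16) of the paper) is not reproduced here; `pc_lc_loss`, `model`, and `train_loader` are hypothetical placeholders, and only the reported hyperparameters (Adam, initial learning rate 0.01, batch size 256, warm-up for T epochs with CE, then T more epochs with the proposed loss) are taken from the table.

```python
# Sketch only: pc_lc_loss, model, and train_loader are assumed placeholders.
import torch
import torch.nn as nn

def train_schedule(model, train_loader, pc_lc_loss, T=50, lr=0.01, device="cpu"):
    """Warm up with cross-entropy for T epochs, then train with PC-LC for T more."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # batch size 256 is set in the loader
    ce_loss = nn.CrossEntropyLoss()
    model.to(device)

    for epoch in range(2 * T):
        # First T epochs: standard cross-entropy warm-up.
        # Remaining T epochs: PC loss with logit constraints
        # (xi = 0.995, lambda = 0.05 in the reported setup).
        criterion = ce_loss if epoch < T else pc_lc_loss
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model
```

The PGD evaluation settings in the same row (40 iterations, per-step perturbation 0.01) could be realized with a sketch like the one below. The overall L∞ budget is not stated in this row, so `epsilon` is left as an explicit, assumed argument; `model`, `x`, and `y` are likewise assumed inputs.

```python
# Sketch of an untargeted L-infinity PGD attack with the quoted iteration count / step size.
import torch
import torch.nn as nn

def pgd_attack(model, x, y, epsilon, steps=40, step_size=0.01):
    loss_fn = nn.CrossEntropyLoss()
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the epsilon ball and valid pixel range.
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv
```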