Fast and Scalable Adversarial Training of Kernel SVM via Doubly Stochastic Gradients
Authors: Huimin Wu, Zhengmian Hu, Bin Gu (pp. 10329–10337)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experimental results show that our adversarial training algorithm enjoys robustness against various attacks while retaining efficiency and scalability similar to the classical DSG algorithm. |
| Researcher Affiliation | Collaboration | Huimin Wu1, Zhengmian Hu2, Bin Gu1,3,4 1 School of Computer & Software, Nanjing University of Information Science & Technology, P.R.China 2Department of Electrical & Computer Engineering, University of Pittsburgh, PA, USA 3JD Finance America Corporation, Mountain View, CA, USA 4MBZUAI, United Arab Emirates |
| Pseudocode | Yes | Algorithm 1: {α_i}_{i=1}^t = Train(P(x, y)) |
| Open Source Code | No | The DSG code is available at https://github.com/zixu1986/Doubly_Stochastic_Gradients. (This link is for the DSG framework on which their model is based, not their specific adv-SVM implementation code.) |
| Open Datasets | Yes | Datasets. We evaluate the robustness of adv-SVM on two well-known datasets, MNIST (Lecun and Bottou 1998) and CIFAR10 (Krizhevsky and Hinton 2009). |
| Dataset Splits | Yes | 5-fold cross validation is used to choose the optimal hyper-parameters (the regularization parameter C and the step size γ). |
| Hardware Specification | Yes | We perform experiments on Intel Xeon E5-2696 machine with 48GB RAM. |
| Software Dependencies | No | This algorithm is implemented in CVX, a package for specifying and solving convex programs (Grant and Boyd 2014). (While CVX is mentioned, no specific version number is provided for CVX or any other software dependency.) |
| Experiment Setup | Yes | For FGSM and PGD, the maximum perturbation ϵ is set as 8/255 and the step size for PGD is ϵ/4. ... For ZOO, we use the ZOO-ADAM algorithm and set the step size η = 0.01, ADAM parameters β1 = 0.9, β2 = 0.999. ... the number of random features is set as 2^10 and the batch size is 500. 5-fold cross validation is used to choose the optimal hyper-parameters (the regularization parameter C and the step size γ). The parameters C and γ are searched in the region {(C, γ) \| −3 ≤ log2 C ≤ 3, −3 ≤ log2 γ ≤ 3}. |
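The experiment setup fixes the number of random features at 2^10, which refers to the random Fourier feature approximation of the kernel that underlies the DSG framework. The following is a minimal, self-contained sketch of that idea (not the authors' adv-SVM implementation): features φ(x) are drawn so that the inner product φ(x)·φ(z) approximates the RBF kernel exp(−γ‖x − z‖²), with the approximation error shrinking as the number of features D grows.

```python
import math
import random

def rbf_kernel(x, z, gamma):
    """Exact RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def sample_feature_params(dim, n_features, gamma, rng):
    """Draw w ~ N(0, 2*gamma*I) and b ~ U[0, 2*pi) for each random feature."""
    std = math.sqrt(2.0 * gamma)
    ws = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return ws, bs

def random_fourier_features(x, ws, bs):
    """Map x to D features so that phi(x) . phi(z) approximates k(x, z)."""
    scale = math.sqrt(2.0 / len(ws))
    return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(ws, bs)]

rng = random.Random(0)
gamma = 1.0      # the paper searches log2(gamma) in [-3, 3]; 2^0 used here
D = 2 ** 10      # number of random features, matching the paper's setting
x, z = [0.3, -0.7], [0.1, 0.4]

ws, bs = sample_feature_params(len(x), D, gamma, rng)
phi_x = random_fourier_features(x, ws, bs)
phi_z = random_fourier_features(z, ws, bs)

approx = sum(a * b for a, b in zip(phi_x, phi_z))
exact = rbf_kernel(x, z, gamma)
print(abs(approx - exact))  # small; error shrinks roughly like 1/sqrt(D)
```

In the full DSG algorithm this randomness is paired with stochastic sampling of data points, giving the "doubly stochastic" gradient; the example points x, z and the choice γ = 1 here are illustrative only.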