Fast and Scalable Adversarial Training of Kernel SVM via Doubly Stochastic Gradients

Authors: Huimin Wu, Zhengmian Hu, Bin Gu (pp. 10329-10337)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Comprehensive experimental results show that our adversarial training algorithm enjoys robustness against various attacks and meanwhile has the similar efficiency and scalability with classical DSG algorithm."
Researcher Affiliation | Collaboration | Huimin Wu (1), Zhengmian Hu (2), Bin Gu (1,3,4); (1) School of Computer & Software, Nanjing University of Information Science & Technology, P.R. China; (2) Department of Electrical & Computer Engineering, University of Pittsburgh, PA, USA; (3) JD Finance America Corporation, Mountain View, CA, USA; (4) MBZUAI, United Arab Emirates
Pseudocode | Yes | "Algorithm 1: {α_i}_{i=1}^t = Train(P(x, y))" (a minimal DSG training sketch follows the table)
Open Source Code | No | "The DSG code is available at https://github.com/zixu1986/Doubly_Stochastic_Gradients." (This link is for the DSG framework on which their model is based, not their specific adv-SVM implementation code.)
Open Datasets | Yes | "Datasets. We evaluate the robustness of adv-SVM on two well-known datasets, MNIST (Lecun and Bottou 1998) and CIFAR10 (Krizhevsky and Hinton 2009)."
Dataset Splits | Yes | "5-fold cross validation is used to choose the optimal hyper-parameters (the regularization parameter C and the step size γ)." (see the grid-search sketch after the table)
Hardware Specification | Yes | "We perform experiments on Intel Xeon E5-2696 machine with 48GB RAM."
Software Dependencies | No | "This algorithm is implemented in CVX, a package for specifying and solving convex programs (Grant and Boyd 2014)." (While CVX is mentioned, no specific version number is provided for CVX or any other software dependency.)
Experiment Setup | Yes | "For FGSM and PGD, the maximum perturbation ϵ is set as 8/255 and the step size for PGD is ϵ/4. ... For ZOO, we use the ZOO-ADAM algorithm and set the step size η = 0.01, ADAM parameters β1 = 0.9, β2 = 0.999. ... the number of random features is set as 2^10 and the batch size is 500. 5-fold cross validation is used to choose the optimal hyper-parameters (the regularization parameter C and the step size γ). The parameters C and γ are searched in the region {(C, γ) | −3 ≤ log2 C ≤ 3, −3 ≤ log2 γ ≤ 3}." (attack-generation and grid-search sketches follow the table)
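
The Algorithm 1 signature in the Pseudocode row returns a coefficient sequence {α_i}_{i=1}^t from the training distribution P(x, y). Below is a minimal, hedged Python sketch of the plain doubly stochastic gradient (DSG) loop that adv-SVM builds on: each iteration samples one random data point and one random Fourier feature, a hinge-loss subgradient yields the new coefficient, and earlier coefficients shrink under L2 regularization. The defaults T, gamma0, C, and sigma are illustrative, and the paper's closed-form adversarial-perturbation step is omitted.

```python
import numpy as np

def dsg_train_svm(X, y, T=2000, gamma0=1.0, C=1.0, sigma=1.0, seed=0):
    """Sketch of DSG training for a hinge-loss SVM with an RBF kernel.

    Returns the coefficient sequence {alpha_i} together with the random
    Fourier features (omega_i, b_i) sampled along the way.  T, gamma0,
    C, and sigma are illustrative defaults, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alphas, omegas, offsets = [], [], []

    def predict(x):
        # f(x) = sum_i alpha_i * sqrt(2) * cos(omega_i . x + b_i)
        return sum(a * np.sqrt(2.0) * np.cos(w @ x + b)
                   for a, w, b in zip(alphas, omegas, offsets))

    for t in range(1, T + 1):
        i = rng.integers(n)                            # randomness source 1: a data point
        x_t, y_t = X[i], y[i]
        omega = rng.normal(scale=1.0 / sigma, size=d)  # randomness source 2: a random feature
        b = rng.uniform(0.0, 2.0 * np.pi)

        step = gamma0 / np.sqrt(t)                     # decaying step size gamma_t
        margin = y_t * predict(x_t)
        subgrad = -C * y_t if margin < 1.0 else 0.0    # hinge-loss subgradient

        alphas = [a * (1.0 - step) for a in alphas]    # shrinkage from L2 regularization
        alphas.append(-step * subgrad * np.sqrt(2.0) * np.cos(omega @ x_t + b))
        omegas.append(omega)
        offsets.append(b)

    return alphas, omegas, offsets
```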
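The attack settings quoted in the Experiment Setup row pin down ϵ = 8/255 for FGSM and PGD and a PGD step size of ϵ/4. A minimal sketch of those two attacks follows; the iteration count steps=10 and the gradient callback loss_grad are assumptions, since the quoted text does not specify them.

```python
import numpy as np

def fgsm(x, grad_x, eps=8 / 255):
    """One-shot FGSM with the paper's maximum perturbation eps = 8/255."""
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

def pgd(x, loss_grad, eps=8 / 255, steps=10):
    """Iterated PGD with step size eps/4 as in the paper; the number of
    steps and the loss_grad(x) -> gradient callback are assumptions."""
    alpha = eps / 4
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(loss_grad(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in the valid pixel range
    return x_adv
```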
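The hyper-parameter search combines the Dataset Splits and Experiment Setup rows: 5-fold cross validation over the grid −3 ≤ log2 C ≤ 3, −3 ≤ log2 γ ≤ 3. Below is a hedged sketch that enumerates integer powers of two (an assumption about grid spacing); train_fn and score_fn are hypothetical callbacks standing in for the authors' adv-SVM training and evaluation routines.

```python
import numpy as np
from itertools import product
from sklearn.model_selection import KFold

# Paper's search region: -3 <= log2(C) <= 3 and -3 <= log2(gamma) <= 3,
# enumerated here at integer exponents (an assumption about grid spacing).
GRID = [(2.0 ** a, 2.0 ** b) for a, b in product(range(-3, 4), repeat=2)]

def select_hyperparams(X, y, train_fn, score_fn):
    """5-fold CV over (C, gamma); train_fn and score_fn are hypothetical
    stand-ins for the paper's adv-SVM training and evaluation."""
    kf = KFold(n_splits=5, shuffle=True, random_state=0)
    best, best_score = None, -np.inf
    for C, gamma in GRID:
        fold_scores = []
        for tr, va in kf.split(X):
            model = train_fn(X[tr], y[tr], C=C, gamma=gamma)
            fold_scores.append(score_fn(model, X[va], y[va]))
        mean_score = float(np.mean(fold_scores))
        if mean_score > best_score:
            best, best_score = (C, gamma), mean_score
    return best
```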