Implicit Bias of Adversarial Training for Deep Neural Networks
Authors: Bochen Lv, Zhanxing Zhu
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct numerical experiments on the MNIST dataset to support our claims. We adversarially trained a 3-layer neural network using SGD with constant learning rate and batch size 80. The model has the architecture input layer-1024-ReLU-64-ReLU-output layer. We present results for adversarial training with: (1) FGSM perturbations with ϵ = 16/255; (2) ℓ∞-PGD perturbations, where the PGD is run for 5 steps with step size 6/255 and ϵ = 16/255. |
| Researcher Affiliation | Collaboration | Bochen Lyu, DataCanvas Lab, DataCanvas, Beijing, China (lvbc@zetyun.com); Zhanxing Zhu, The University of Edinburgh, UK (zhanxing.zhu@gmail.com) |
| Pseudocode | Yes | Algorithm 1 (Adversarial Training). Input: training set S = {(x_i, y_i)}_{i=1}^n, adversary A to solve the inner maximization, learning rate η, initialization W_k for k ∈ {1, ..., L}. For t = 0 to T−1: set S'(t) = ∅; for i = 1 to n, compute x'_i(t) = A(x_i, y_i, W(t)) and add (x'_i(t), y_i) to S'(t); then for k = 1 to L, update W_k(t+1) = W_k(t) − η(t) ∂L(S'(t); W)/∂W_k. (A runnable sketch of this loop is given after the table.) |
| Open Source Code | No | The paper does not contain any statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | In this section, we conduct numerical experiments on the MNIST dataset to support our claims. |
| Dataset Splits | No | The paper uses the MNIST dataset but does not explicitly describe training, validation, and test splits or a specific validation methodology like k-fold cross-validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, cloud instances) used for running experiments. |
| Software Dependencies | No | The paper mentions using SGD for training but does not provide any specific software dependencies with version numbers. |
| Experiment Setup | Yes | We adversarially trained a 3-layer neural network using SGD with constant learning rate and batch size 80. The model has the architecture input layer-1024-ReLU-64-ReLU-output layer. We present results for adversarial training with: (1) FGSM perturbations with ϵ = 16/255; (2) ℓ∞-PGD perturbations, where the PGD is run for 5 steps with step size 6/255 and ϵ = 16/255. (A PyTorch-style sketch of this setup follows the table.) |
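
Algorithm 1 above is a standard adversarial training loop: generate adversarial examples with the adversary A, then take a gradient step on the perturbed batch. Below is a minimal sketch assuming PyTorch and an FGSM adversary; the function names (`fgsm_attack`, `adversarial_training`), the cross-entropy loss, and the clamping of inputs to [0, 1] are illustrative assumptions, not details stated in the paper.

```python
import torch
import torch.nn as nn

def fgsm_attack(model, loss_fn, x, y, eps):
    """One-step FGSM adversary A(x, y, W): move x by eps in the direction sign(grad_x loss)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0.0, 1.0).detach()

def adversarial_training(model, loader, epochs, lr, eps, device="cpu"):
    """Outer loop of Algorithm 1: build adversarial examples, then take an SGD step on them."""
    loss_fn = nn.CrossEntropyLoss()
    model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=lr)  # constant learning rate, as in the paper
    for _ in range(epochs):                           # t = 0, ..., T-1
        for x, y in loader:                           # mini-batch version of the i = 1, ..., n loop
            x, y = x.to(device), y.to(device)
            x_adv = fgsm_attack(model, loss_fn, x, y, eps)  # x'_i(t) = A(x_i, y_i, W(t))
            opt.zero_grad()
            loss_fn(model(x_adv), y).backward()       # gradient of L(S'(t); W)
            opt.step()                                # W_k(t+1) = W_k(t) - eta(t) * dL/dW_k
    return model
```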
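
The experiment-setup row could be reproduced along the following lines. The input-1024-ReLU-64-ReLU-output MLP, batch size 80, and the 5-step ℓ∞-PGD adversary with step size 6/255 and ϵ = 16/255 come from the paper; the PyTorch framework, the data-loading code, the learning rate, and the epoch count are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms

# 3-layer MLP reported in the paper: input -> 1024 -> ReLU -> 64 -> ReLU -> output (10 classes)
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 1024), nn.ReLU(),
    nn.Linear(1024, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

def pgd_linf_attack(model, loss_fn, x, y, eps=16 / 255, step=6 / 255, steps=5):
    """5-step l_inf PGD adversary with the radius and step size reported in the paper."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.autograd.grad(loss_fn(model(x_adv), y), x_adv)[0]
        x_adv = x_adv.detach() + step * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back onto the l_inf ball around x
        x_adv = x_adv.clamp(0.0, 1.0)             # keep pixels in the valid range (assumption)
    return x_adv.detach()

# MNIST with batch size 80, as reported; the learning rate and epoch count are assumptions.
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor()),
    batch_size=80, shuffle=True,
)
adversarial_training(model, train_loader, epochs=10, lr=0.01, eps=16 / 255)
```

To reproduce the ℓ∞-PGD variant rather than FGSM, the call to `fgsm_attack` inside `adversarial_training` would be replaced by `pgd_linf_attack`.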