Implicit Bias of Gradient Descent based Adversarial Training on Separable Data
Authors: Yan Li, Ethan X. Fang, Huan Xu, Tuo Zhao
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct numerical experiments on linear classifiers to back up our theoretical findings. We further empirically extend our method to neural networks, where our numerical results demonstrate that our theoretical results can potentially be generalized. |
| Researcher Affiliation | Academia | Yan Li, Huan Xu, Tuo Zhao (H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30318; {yli939, huan.xu, tourzhao}@gatech.edu); Ethan X. Fang (Department of Statistics, Pennsylvania State University, University Park, PA 16802; xxf13@psu.edu) |
| Pseudocode | Yes | Algorithm 1: Gradient Descent based Adversarial Training (GDAT) with ℓq-norm Perturbation; see the sketch after this table. |
| Open Source Code | No | No statement or link explicitly providing access to the source code for the methodology described in this paper was found. |
| Open Datasets | Yes | We take the two classes from the MNIST dataset with labels "2" and "9" to form our training set S. |
| Dataset Splits | No | The paper does not explicitly provide details about training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) beyond mentioning training and test sets. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments were mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers were mentioned. The paper discusses using stochastic gradient descent but does not list any libraries or frameworks with versions. |
| Experiment Setup | Yes | For standard clean training and the outer minimization problem in (2), we use the stochastic gradient descent algorithm with batch size 128 and constant stepsize 10⁻⁵. ... we solve the inner problem approximately using projected gradient descent with 20 iterations and stepsize 0.01. We test two versions of GDAT, where one adopts ℓ2-norm perturbations (c = 2.8), and the other uses ℓ∞-norm perturbations (c = 0.1). A hedged sketch of this setup appears below the table. |
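
The Pseudocode and Experiment Setup rows above outline the method: an outer SGD loop over a linear classifier, with an inner projected gradient ascent that finds an ℓq-bounded perturbation for each example (20 iterations, stepsize 0.01, radius c). The following minimal sketch puts those pieces together. It is an illustration under assumptions, not the authors' code: it assumes a logistic loss and NumPy, and the function names (`project_ball`, `inner_pgd`, `gdat`) are our own; the paper's exact loss, epoch counts, and data pipeline are not reproduced here.

```python
import numpy as np

def project_ball(delta, c, q):
    """Project delta onto the lq ball of radius c, for q in {2, np.inf}."""
    if q == 2:
        norm = np.linalg.norm(delta)
        return delta if norm <= c else delta * (c / norm)
    return np.clip(delta, -c, c)  # q = inf: coordinate-wise clipping

def inner_pgd(w, x, y, c, q, steps=20, lr=0.01):
    """Approximate the inner maximization max_{||delta||_q <= c} loss
    via projected gradient ascent (20 steps, stepsize 0.01, as quoted)."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        # Logistic loss log(1 + exp(-y * w.(x + delta))); its gradient
        # in delta is -y * sigmoid(-y * w.(x + delta)) * w.
        margin = y * w @ (x + delta)
        grad = -y * w / (1.0 + np.exp(margin))
        delta = project_ball(delta + lr * grad, c, q)  # ascent + projection
    return delta

def gdat(X, Y, c, q, epochs=10, batch_size=128, lr=1e-5, seed=0):
    """Outer SGD loop: adversarially perturb each example, then step."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for batch in np.array_split(rng.permutation(n), max(1, n // batch_size)):
            grad = np.zeros(d)
            for i in batch:
                x_adv = X[i] + inner_pgd(w, X[i], Y[i], c, q)
                margin = Y[i] * w @ x_adv
                grad -= Y[i] * x_adv / (1.0 + np.exp(margin))
            w -= lr * grad / len(batch)
    return w
```

To mirror the two quoted configurations, one would call `gdat(X, Y, c=2.8, q=2)` for the ℓ2-norm version and `gdat(X, Y, c=0.1, q=np.inf)` for the ℓ∞-norm version, with labels Y in {-1, +1}.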