Implicit Bias of Gradient Descent based Adversarial Training on Separable Data

Authors: Yan Li, Ethan X. Fang, Huan Xu, Tuo Zhao

ICLR 2020

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct numerical experiments on linear classifiers to back up our theoretical findings. We further empirically extend our method to neural networks, where our numerical results demonstrate that our theoretical results can potentially be generalized.
Researcher Affiliation Academia Yan Li, Huan Xu, Tuo Zhao, H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30318, {yli939, huan.xu, tourzhao}@gatech.edu; Ethan X. Fang, Department of Statistics, Pennsylvania State University, University Park, PA 16802, xxf13@psu.edu
Pseudocode Yes Algorithm 1 Gradient Descent based Adversarial Training (GDAT) with ℓq-norm Perturbation
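For a linear classifier, the inner maximization of GDAT over an ℓq-ball of radius c has a closed form: the worst-case perturbation shrinks each example's margin by c times the dual norm of the weight vector. The ℓ2 case of the outer gradient-descent loop can therefore be sketched as below. This is a minimal illustration under assumed choices (logistic loss, full-batch gradient descent, illustrative function and variable names), not the paper's exact algorithm.

```python
import numpy as np

def gdat_l2(X, y, c=0.1, lr=0.1, iters=500):
    """Sketch of GDAT for a linear classifier with l2-norm perturbations.

    For linear models, max over ||delta||_2 <= c of the logistic loss on
    (x + delta, y) is attained in closed form: the margin y*<w, x> drops
    by c*||w||_2. So the adversarial loss is log(1 + exp(-(y<w,x> - c||w||_2))).
    Loss choice and hyperparameter defaults are illustrative assumptions.
    """
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        # worst-case (robust) margins under the l2 perturbation budget
        margins = y * (X @ w) - c * np.linalg.norm(w)
        s = 1.0 / (1.0 + np.exp(margins))       # |dloss/dmargin| per example
        wn = w / max(np.linalg.norm(w), 1e-12)  # (sub)gradient of ||w||_2
        # chain rule: dmargin/dw = y*x - c*w/||w||_2
        grad = (-s[:, None] * (y[:, None] * X - c * wn)).mean(axis=0)
        w -= lr * grad
    return w
```

On linearly separable data this loop drives the weights toward a robust max-margin direction, which is the implicit-bias phenomenon the paper analyzes.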
Open Source Code No No statement or link explicitly providing access to the source code for the methodology described in this paper was found.
Open Datasets Yes We take the two classes from MNIST dataset with labels "2" and "9" to form our training set S.
Dataset Splits No The paper does not explicitly provide details about training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) beyond mentioning training and test sets.
Hardware Specification No No specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments were mentioned in the paper.
Software Dependencies No No specific software dependencies with version numbers were mentioned. The paper discusses using stochastic gradient descent but does not list any libraries or frameworks with versions.
Experiment Setup Yes For standard clean training and the outer minimization problem in (2), we use the stochastic gradient descent algorithm with batch size 128 and constant stepsize 10⁻⁵. ... we solve the inner problem approximately using projected gradient descent with 20 iterations and stepsize 0.01. We test two versions of GDAT, where one adopts ℓ2-norm perturbations (c = 2.8), and the other uses ℓ∞-norm perturbations (c = 0.1).
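The quoted inner solver (projected gradient descent, 20 iterations, stepsize 0.01, ℓ2 radius c = 2.8) can be sketched for a linear classifier as follows. The logistic loss and the function/variable names are assumptions for illustration; the paper's experiments also apply this inner loop to neural networks.

```python
import numpy as np

def pgd_l2_attack(w, x, y, c=2.8, steps=20, eta=0.01):
    """Approximate inner maximization: projected gradient ascent on the
    logistic loss over the l2-ball {delta : ||delta||_2 <= c}.
    Defaults mirror the quoted setup (20 iterations, stepsize 0.01, c = 2.8);
    the loss and signatures here are illustrative assumptions.
    """
    delta = np.zeros_like(x)
    for _ in range(steps):
        margin = y * np.dot(w, x + delta)
        g = -y * w / (1.0 + np.exp(margin))  # gradient of loss w.r.t. delta
        delta += eta * g                     # ascent step (maximize the loss)
        nrm = np.linalg.norm(delta)
        if nrm > c:
            delta *= c / nrm                 # project back onto the l2-ball
    return delta
```

The ℓ∞ variant (c = 0.1) would replace the ball projection with a coordinate-wise clip of delta to [-c, c].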