Convergence of Adversarial Training in Overparametrized Neural Networks

Authors: Ruiqi Gao, Tianle Cai, Haochuan Li, Cho-Jui Hsieh, Liwei Wang, Jason D. Lee

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper provides a partial answer to the success of adversarial training, by showing that it converges to a network where the surrogate loss with respect to the attack algorithm is within $\epsilon$ of the optimal robust loss (see the sketch after this table). Then we show that the optimal robust loss is also close to zero, hence adversarial training finds a robust classifier. The analysis technique leverages recent work on the analysis of neural networks via the Neural Tangent Kernel (NTK), combined with motivation from online learning when the maximization is solved by a heuristic, and the expressiveness of the NTK kernel in the $\ell_\infty$-norm. In addition, we also prove that robust interpolation requires more model capacity, supporting the evidence that adversarial training requires wider networks.
Researcher Affiliation | Academia | Ruiqi Gao (1), Tianle Cai (1), Haochuan Li (2), Liwei Wang (3), Cho-Jui Hsieh (4), Jason D. Lee (5). (1) School of Mathematical Sciences, Peking University; (2) Department of EECS, Massachusetts Institute of Technology; (3) Key Laboratory of Machine Perception, MOE, School of EECS, Peking University; (4) Department of Computer Science, University of California, Los Angeles; (5) Department of Electrical Engineering, Princeton University.
Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statements about open-source code availability or links to repositories for its methodology.
Open Datasets | No | The paper is theoretical and does not describe experiments on specific, publicly available datasets.
Dataset Splits | No | The paper is theoretical and does not describe an experimental setup, so no training/validation/test split information is provided.
Hardware Specification | No | The paper is theoretical and does not describe any experimental hardware.
Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not include details of an experimental setup or hyperparameters.
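
As referenced in the Research Type row, here is a minimal LaTeX sketch of the min-max objective and the kind of convergence guarantee the abstract describes. The notation ($f_W$, loss $\ell$, perturbation radius $\rho$, heuristic attack $\mathcal{A}$, iterate count $T$) is ours, introduced for illustration; the paper's exact statement and constants differ.

% Adversarial training minimizes the worst-case (robust) loss over
% norm-bounded perturbations of each training point:
\[
  \min_{W} L_{\mathrm{rob}}(W), \qquad
  L_{\mathrm{rob}}(W) = \frac{1}{n} \sum_{i=1}^{n}
    \max_{\|\delta_i\| \le \rho}
    \ell\bigl(f_W(x_i + \delta_i),\, y_i\bigr).
\]
% The convergence result, informally: for a sufficiently wide network,
% running adversarial training for T steps with a heuristic inner
% maximizer A keeps the average surrogate loss (measured against the
% attack A) within epsilon of the optimal robust loss:
\[
  \frac{1}{T} \sum_{t=1}^{T} \frac{1}{n} \sum_{i=1}^{n}
    \ell\bigl(f_{W_t}(\mathcal{A}(x_i)),\, y_i\bigr)
  \;\le\; \min_{W} L_{\mathrm{rob}}(W) + \epsilon.
\]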