Benign Overfitting in Adversarial Training of Neural Networks

Authors: Yunjuan Wang, Kaibo Zhang, Raman Arora

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We support our theoretical findings with an empirical study on synthetic and real-world data."
Researcher Affiliation | Academia | "Department of Computer Science, Johns Hopkins University, Baltimore, USA."
Pseudocode | Yes | Algorithm 1: Gradient Descent-based Adversarial Training.
Open Source Code | No | The paper does not provide any statements or links indicating that open-source code for the described methodology is available.
Open Datasets | Yes | "In this section, we present a simple empirical study on a synthetic dataset to support our theoretical results. We follow the generative model in Section 2 to synthesize a dataset with independent label flips... We observe the same trends on MNIST dataset even though the data generative assumptions are no longer valid." (A hedged sketch of such a data generator appears after the table.)
Dataset Splits | No | For the synthetic dataset, the paper states, "generate n = 100 training samples and 2K test samples." For MNIST, it mentions "12,665 training examples and 2,115 test examples." No separate validation split is explicitly provided or mentioned.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number or list other software dependencies with their respective versions. The text says: "We use the default initialization in PyTorch and train the network..."
Experiment Setup | Yes | "We train a two-layer ReLU network with width 1K. We use the default initialization in PyTorch and train the network applying full-batch gradient-descent-based adversarial training using logistic loss for 1K iterations. We use a PGD attack to generate adversarial examples with attack strength α/µ and attack stepsize α/(5µ) for 20 iterations. The outer minimization is trained using an initial learning rate of 0.1 with decay by 10 after every 500 iterations." (A hedged PyTorch sketch of this training loop follows the table.)
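
The Open Datasets and Dataset Splits rows describe a synthetic dataset drawn from the paper's Section 2 generative model with independent label flips (n = 100 training samples, 2K test samples). The sketch below is a rough stand-in for that setup, not the authors' generator: the dimension d, the signal vector µ, the noise scale, and the flip probability are all placeholder assumptions.

```python
import torch

def make_dataset(n, d=200, signal=5.0, flip_prob=0.1, seed=0):
    """Two-class mixture x = y * mu + Gaussian noise, with each label flipped independently."""
    g = torch.Generator().manual_seed(seed)
    mu = torch.zeros(d)
    mu[0] = signal                                               # placeholder mean direction
    y_clean = torch.randint(0, 2, (n,), generator=g) * 2 - 1     # clean labels in {-1, +1}
    X = y_clean[:, None] * mu + torch.randn(n, d, generator=g)   # signal plus Gaussian noise
    flips = torch.rand(n, generator=g) < flip_prob               # independent label flips
    y = torch.where(flips, -y_clean, y_clean).float()
    return X, y

X_train, y_train = make_dataset(n=100, seed=0)    # n = 100 training samples
X_test, y_test = make_dataset(n=2000, seed=1)     # 2K test samples
```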
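
The Experiment Setup row, together with the Algorithm 1 caption ("Gradient Descent-based Adversarial Training"), suggests the following minimal PyTorch sketch. It reuses X_train and y_train from the generator above and follows the stated hyperparameters (width-1K two-layer ReLU network, default initialization, full-batch gradient descent on the logistic loss for 1K iterations, a 20-step PGD attack, learning rate 0.1 decayed by 10 every 500 iterations); the l2 perturbation geometry, the numeric attack radius, and the helper names are assumptions rather than the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

width = 1000                                        # two-layer ReLU network of width 1K
model = nn.Sequential(nn.Linear(X_train.shape[1], width), nn.ReLU(), nn.Linear(width, 1))
# PyTorch's default initialization is kept, as stated in the paper.

def logistic_loss(scores, labels):
    # Logistic loss log(1 + exp(-y * f(x))) for labels in {-1, +1}.
    return F.softplus(-labels * scores.squeeze(-1)).mean()

def pgd_attack(model, X, y, radius, step, iters=20):
    # l2-bounded PGD (assumed geometry): ascend the loss with normalized gradient steps,
    # then project the perturbation back onto the ball of the given radius.
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(iters):
        loss = logistic_loss(model(X + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step * grad / (grad.norm(dim=1, keepdim=True) + 1e-12)
            delta *= radius / delta.norm(dim=1, keepdim=True).clamp(min=radius)
    return delta.detach()

radius = 1.0                                        # stands in for the paper's attack strength (illustrative value)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=500, gamma=0.1)  # decay by 10 every 500 iterations

for it in range(1000):                              # 1K full-batch outer iterations
    delta = pgd_attack(model, X_train, y_train, radius, step=radius / 5)  # stepsize = strength / 5
    opt.zero_grad()
    loss = logistic_loss(model(X_train + delta), y_train)
    loss.backward()
    opt.step()
    sched.step()
```

Comparing sign(model(X_test)) against y_test on clean and PGD-perturbed test inputs would then estimate the standard and robust test errors that the benign-overfitting analysis concerns.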