Benign Overfitting in Adversarial Training of Neural Networks
Authors: Yunjuan Wang, Kaibo Zhang, Raman Arora
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We support our theoretical findings with an empirical study on synthetic and real-world data. |
| Researcher Affiliation | Academia | Department of Computer Science, Johns Hopkins University, Baltimore, USA. |
| Pseudocode | Yes | Algorithm 1: Gradient Descent-based Adversarial Training |
| Open Source Code | No | The paper does not provide any statements or links indicating that open-source code for the described methodology is available. |
| Open Datasets | Yes | In this section, we present a simple empirical study on a synthetic dataset to support our theoretical results. We follow the generative model in Section 2 to synthesize a dataset with independent label flips... We observe the same trends on MNIST dataset even though the data generative assumptions are no longer valid. |
| Dataset Splits | No | For the synthetic dataset, the paper states: "generate n = 100 training samples and 2K test samples." For MNIST, it mentions "12,665 training examples and 2,115 test examples." No validation split is mentioned for either dataset. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions PyTorch but does not specify a version number or list other software dependencies with their versions. The text says: "We use the default initialization in PyTorch and train the network..." |
| Experiment Setup | Yes | We train a two-layer ReLU network with width 1K. We use the default initialization in PyTorch and train the network applying full-batch gradient-descent-based adversarial training using logistic loss for 1K iterations. We use the PGD attack to generate adversarial examples with attack strength α/‖µ‖ and attack step size α/(5‖µ‖) for 20 iterations. The outer minimization uses an initial learning rate of 0.1, decayed by a factor of 10 every 500 iterations. Hedged code sketches of this setup follow the table. |
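
To make the rows above concrete, here is a minimal sketch of the synthetic dataset described in the "Open Datasets" and "Dataset Splits" rows. It assumes the common benign-overfitting generative model (x = y·µ plus Gaussian noise, with each label flipped independently); the dimension `d`, signal strength `mu_norm`, flip rate `eta`, and the helper name `make_dataset` are illustrative assumptions, not values or code from the paper.

```python
import torch

def make_dataset(n, d=500, mu_norm=5.0, eta=0.1, seed=0):
    """Draw n samples: x = y_clean * mu + standard Gaussian noise, then flip
    each observed label independently with probability eta."""
    g = torch.Generator().manual_seed(seed)
    mu = torch.zeros(d)
    mu[0] = mu_norm                                           # fixed signal direction (our choice)
    y_clean = torch.randint(0, 2, (n,), generator=g) * 2 - 1  # uniform +/-1 labels
    x = y_clean[:, None].float() * mu + torch.randn(n, d, generator=g)
    flips = torch.rand(n, generator=g) < eta                  # independent label flips
    y = torch.where(flips, -y_clean, y_clean).float()
    return x, y

# "n = 100 training samples and 2K test samples", as quoted in the table.
x_train, y_train = make_dataset(100)
x_test, y_test = make_dataset(2000, seed=1)
```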
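The "Pseudocode" and "Experiment Setup" rows describe Algorithm 1 (gradient-descent-based adversarial training) with a width-1K two-layer ReLU network and a 20-step PGD attack of strength α/‖µ‖ and step size α/(5‖µ‖). The sketch below is one plausible reading of that setup: it assumes an ℓ2 threat model and writes the logistic loss as `softplus(-y·f(x))`; the class and function names are ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLayerReLU(nn.Module):
    """Two-layer ReLU network; PyTorch's default initialization, as in the paper."""
    def __init__(self, d, width=1000):
        super().__init__()
        self.fc1 = nn.Linear(d, width)
        self.fc2 = nn.Linear(width, 1)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x))).squeeze(-1)

def pgd_attack(model, x, y, eps, step, iters=20):
    """l2 PGD (assumed threat model): ascend the logistic loss, then project
    each perturbation back onto the l2 ball of radius eps."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        loss = F.softplus(-y * model(x + delta)).mean()      # logistic loss
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            g = grad.flatten(1).norm(dim=1).clamp_min(1e-12)
            delta += step * grad / g[:, None]                # normalized ascent step
            d_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12)
            delta *= (eps / d_norm).clamp(max=1.0)[:, None]  # l2 projection
    return (x + delta).detach()
```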
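Putting the pieces together, here is a full-batch training loop that follows the reported schedule (1K outer iterations, learning rate 0.1 decayed by a factor of 10 every 500 iterations). It reuses `make_dataset`, `TwoLayerReLU`, and `pgd_attack` from the sketches above; the values of `alpha` and `mu_norm` are placeholders, not numbers from the paper.

```python
alpha, mu_norm = 1.0, 5.0                              # illustrative, not from the paper
eps, step = alpha / mu_norm, alpha / (5 * mu_norm)     # attack strength and step size

model = TwoLayerReLU(d=x_train.shape[1])
opt = torch.optim.SGD(model.parameters(), lr=0.1)      # full batch, so plain gradient descent
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=500, gamma=0.1)

for _ in range(1000):
    x_adv = pgd_attack(model, x_train, y_train, eps, step)  # inner maximization
    opt.zero_grad()
    loss = F.softplus(-y_train * model(x_adv)).mean()        # outer minimization
    loss.backward()
    opt.step()
    sched.step()

with torch.no_grad():
    preds = (model(x_test) > 0).float() * 2 - 1
    clean_acc = (preds == y_test).float().mean().item()  # clean test accuracy
```

Robust test accuracy would be computed the same way after perturbing `x_test` with `pgd_attack`, which is how benign overfitting in the adversarial setting is typically diagnosed (clean accuracy stays high while the model interpolates noisy labels).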