Certifying Some Distributional Robustness with Principled Adversarial Training
Authors: Aman Sinha, Hongseok Namkoong, John Duchi
ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally verify our results in Section 4 and show that we match or achieve state-of-the-art performance on a variety of adversarial attacks. [Section 4, Experiments] Our technique for distributionally robust optimization with adversarial training extends beyond supervised learning. To that end, we present empirical evaluations on supervised and reinforcement learning tasks where we compare performance with empirical risk minimization (ERM) and, where appropriate, models trained with the fast-gradient method (3) (FGM) (Goodfellow et al., 2015), its iterated variant (IFGM) (Kurakin et al., 2016), and the projected-gradient method (PGM) (Madry et al., 2017). |
| Researcher Affiliation | Academia | Aman Sinha (1), Hongseok Namkoong (2), John Duchi (1,3); Departments of (1) Electrical Engineering, (2) Management Science & Engineering, (3) Statistics, Stanford University, Stanford, CA 94305 |
| Pseudocode | Yes | Algorithm 1 Distributionally robust optimization with adversarial training |
| Open Source Code | No | The paper does not provide any explicit statements about making its source code publicly available or links to a code repository. |
| Open Datasets | Yes | We now consider a standard benchmark: training a neural network classifier on the MNIST dataset. We test our adversarial training procedure in the cart-pole environment |
| Dataset Splits | No | The paper mentions training on the MNIST dataset and testing, but does not explicitly provide the specific percentages or counts for training, validation, and test dataset splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, or cloud computing instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'TensorFlow' but does not specify a version number or list other software dependencies with their versions. |
| Experiment Setup | Yes | We train a small neural network with 2 hidden layers of size 4 and 2 and either all ReLU or all ELU activations between layers... For our approach we use γ = 2. The network consists of 8 × 8, 6 × 6, and 5 × 5 convolutional filter layers with ELU activations followed by a fully connected layer and softmax output. We train WRM with γ = 0.04 · E_{P̂_n}[‖X‖₂] |
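
The Pseudocode and Experiment Setup rows above refer to Algorithm 1, the paper's distributionally robust optimization with adversarial training (WRM). The snippet below is a minimal sketch of that training loop, written in PyTorch rather than the authors' TensorFlow code; the ascent step size, iteration count, optimizer, and the toy two-hidden-layer model are illustrative assumptions, not the paper's exact settings.

```python
import torch
import torch.nn as nn

def wrm_step(model, loss_fn, x, y, gamma, opt, ascent_steps=15, ascent_lr=0.1):
    """One outer iteration of WRM-style adversarial training (Algorithm 1 sketch):
    inner gradient ascent approximately maximizes loss(theta; z, y) - gamma * ||z - x||^2,
    then the outer optimizer takes a descent step at the perturbed point z."""
    # Inner maximization: start from the clean input and ascend the penalized surrogate.
    z = x.clone().detach().requires_grad_(True)
    for _ in range(ascent_steps):
        surrogate = loss_fn(model(z), y) - gamma * ((z - x) ** 2).sum()
        grad, = torch.autograd.grad(surrogate, z)
        with torch.no_grad():
            z += ascent_lr * grad
    # Outer minimization: ordinary gradient step on the loss at the adversarial point.
    opt.zero_grad()
    loss = loss_fn(model(z.detach()), y)
    loss.backward()
    opt.step()
    return loss.item()

# Hypothetical toy classifier mirroring the Experiment Setup row:
# two hidden layers of sizes 4 and 2 with ELU activations (2-D inputs assumed).
model = nn.Sequential(nn.Linear(2, 4), nn.ELU(),
                      nn.Linear(4, 2), nn.ELU(),
                      nn.Linear(2, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
```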
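
For comparison, the baselines named in the Research Type row (FGM, IFGM, PGM) perturb inputs with gradient steps under a norm budget rather than through the penalized inner maximization above. The sketch below shows a single-step fast-gradient perturbation in its common ℓ∞ signed-gradient form; the budget `eps` is an assumed value, not taken from the paper.

```python
def fgm_perturb(model, loss_fn, x, y, eps=0.1):
    """Single-step fast-gradient perturbation (sketch of the FGM baseline,
    shown in the l-infinity signed-gradient form; eps is an assumed budget)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + eps * grad.sign()).detach()
```

The iterated (IFGM) and projected-gradient (PGM) variants repeat such a step several times, projecting back onto the eps-ball after each update.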