Certified Robust Neural Networks: Generalization and Corruption Resistance
Authors: Amine Bennouna, Ryan Lucas, Bart Van Parys
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate both theoretically as well as empirically the loss to enjoy a certified level of robustness against two common types of corruption, data evasion and poisoning attacks, while ensuring guaranteed generalization. We show through careful numerical experiments that our resulting holistic robust (HR) training procedure yields state-of-the-art performance. |
| Researcher Affiliation | Academia | 1Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, USA. Correspondence to: Amine Bennouna <amineben@mit.edu>, Ryan Lucas <ryanlu@mit.edu>, Bart Van Parys <vanparys@mit.edu>. |
| Pseudocode | Yes | Algorithm 1 Holistic Robust Training. Specification: Learning rate λ, number of epochs T, minibatches B, minibatch replays M, HR parameters (N, α, r). Input: Data {zᵢ = (xᵢ, yᵢ)} for i ∈ [n]. Initialized θ. (An illustrative sketch of this loop structure follows the table.) |
| Open Source Code | Yes | A ready-to-use Python library implementing our algorithm is available at https://github.com/RyanLucas3/HR_Neural_Networks. We release an open-source Python library (https://github.com/RyanLucas3/HR_Neural_Networks) that can be directly installed through pip. |
| Open Datasets | Yes | We conduct careful numerical experiments which illustrate the efficacy of HR training on both MNIST and CIFAR-10 datasets in all possible corruption settings: clean, affected by poisoning and/or evasive corruption. |
| Dataset Splits | Yes | The hyperparameters of each algorithm are selected based on a 70/30 train/validation split of the training data as detailed in Appendix D.2. (An illustrative split sketch follows the table.) |
| Hardware Specification | No | Table 3 reports the training time (in minutes) per epoch of ERM, PGD, and HR on various datasets and architectures. While this table shows runtimes for different architectures (ConvNet, ResNet-18, EfficientNet), it does not specify the underlying hardware (e.g., GPU/CPU models, memory) used for these experiments. |
| Software Dependencies | No | The paper mentions software components like 'ADAM optimizer', 'SGD', 'PGD algorithm', and 'torchattacks library', but it does not specify their version numbers for reproducibility. |
| Experiment Setup | Yes | For the majority of experiments in Section 6, we use the ResNet18 architecture trained with the help of the ADAM optimizer with a starting learning rate of 1 × 10⁻² and without weight decay. We train throughout all experiments for 300 epochs... For all algorithms, we use the same PGD attack on training with ϵ = 8/255 and 10 attack steps. ... We grid search over the two TRADES hyperparameters β and ϵ. We use β ∈ {1, 6}, the two values used in their comparison of defense methods, and ϵ ∈ {0, 0.05, 0.1, 0.2}... (An illustrative configuration sketch follows the table.) |
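
The Pseudocode row above quotes only the specification line of Algorithm 1 (Holistic Robust Training). As a reading aid, here is a minimal, hypothetical sketch of the loop structure that specification implies: a learning rate, T epochs, minibatches, M minibatch replays, and HR parameters (N, α, r). The `hr_loss` function and all names are placeholders, not the authors' implementation; the actual HR objective is defined in the paper.

```python
# Hypothetical skeleton of the training loop described by Algorithm 1's specification.
# `hr_loss` is a placeholder: the paper's holistic robust loss accounts for evasion,
# poisoning, and statistical error via (N, alpha, r); here we simply fall back to
# cross-entropy, so the HR parameters are unused in this sketch.
import torch
import torch.nn.functional as F

def hr_loss(model, x, y, N, alpha, r):
    # Placeholder only -- NOT the paper's HR loss.
    return F.cross_entropy(model(x), y)

def train_hr(model, loader, lr=1e-2, T=300, M=1, N=8/255, alpha=0.05, r=0.1):
    # Default HR parameter values here are arbitrary illustrations.
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # ADAM, per the Experiment Setup row
    for epoch in range(T):                  # T epochs
        for x, y in loader:                 # minibatches B
            for _ in range(M):              # M minibatch replays per minibatch
                opt.zero_grad()
                loss = hr_loss(model, x, y, N, alpha, r)
                loss.backward()
                opt.step()
    return model
```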
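
The Dataset Splits row reports that hyperparameters were selected on a 70/30 train/validation split of the training data. Below is a minimal sketch of one way to produce such a split with PyTorch utilities; the seed and helper name are assumptions, not taken from the paper.

```python
# Hypothetical 70/30 train/validation split of the training data for hyperparameter selection.
import torch
from torch.utils.data import random_split

def split_train_val(train_set, val_fraction=0.30, seed=0):
    n_val = int(len(train_set) * val_fraction)       # 30% held out for validation
    n_train = len(train_set) - n_val                 # remaining 70% used for training
    generator = torch.Generator().manual_seed(seed)  # seed is an assumption
    return random_split(train_set, [n_train, n_val], generator=generator)
```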
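
The Experiment Setup row fixes ResNet18, ADAM with an initial learning rate of 1 × 10⁻² and no weight decay, 300 epochs, and a PGD attack with ϵ = 8/255 and 10 steps, while the Software Dependencies row mentions the torchattacks library. The snippet below sketches that configuration; the PGD step size and the number of output classes are assumptions not quoted in the report.

```python
# Illustrative configuration consistent with the quoted setup; values flagged as
# assumptions are not taken from the text reproduced above.
import torch
import torchattacks
from torchvision.models import resnet18

model = resnet18(num_classes=10)  # 10 classes, e.g. CIFAR-10 (assumption)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=0)  # ADAM, lr = 1e-2, no weight decay

# PGD attack used during training: eps = 8/255, 10 steps; alpha = 2/255 is an assumption.
attack = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)

# TRADES hyperparameter grids quoted in the setup.
trades_beta_grid = [1, 6]
trades_eps_grid = [0, 0.05, 0.1, 0.2]
```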