Learning perturbation sets for robust machine learning

Authors: Eric Wong, J. Zico Kolter

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using this framework, our approach can generate a variety of perturbations at different complexities and scales, ranging from baseline spatial transformations, through common image corruptions, to lighting variations. We measure the quality of our learned perturbation sets both quantitatively and qualitatively, finding that our models are capable of producing a diverse set of meaningful perturbations beyond the limited data seen during training. Finally, we leverage our learned perturbation sets to train models which are empirically and certifiably robust to adversarial image corruptions and adversarial lighting variations, while improving generalization on non-adversarial data. We highlight the versatility of our approach using CVAEs with an array of experiments, where we vary the complexity and scale of the datasets, perturbations, and downstream tasks.
Researcher Affiliation | Collaboration | Eric Wong, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA (wongeric@mit.edu); J. Zico Kolter, Computer Science Department, Carnegie Mellon University and Bosch Center for Artificial Intelligence, Pittsburgh, PA 15213, USA (zkolter@cs.cmu.edu)
Pseudocode | Yes | Algorithm 1: Given a dataset D, perform an epoch of adversarial training with a learned perturbation set given by a generator g and a radius ϵ with step size γ, using a PGD adversary with T steps and step size α. Algorithm 2: Given a datapoint x, pseudocode for certification and prediction for a classifier which has been smoothed over a learned perturbation set given by a generator g and a radius ϵ, using a noise level σ, with probability at least 1 − α.
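The row above summarizes Algorithm 1 (adversarial training with a PGD adversary run over the learned perturbation set) and Algorithm 2 (smoothing-based certification). As a rough illustration of the first, the sketch below runs PGD in the latent space of the generator g and trains the classifier on the resulting adversarial examples. The g(z, x) interface, latent_dim, and the default step sizes are assumptions for illustration, not the repository's actual API.

```python
import torch
import torch.nn.functional as F

def pgd_latent_attack(model, g, x, y, eps, alpha, T, latent_dim):
    """PGD over the latent space of a learned perturbation set.

    The perturbation set is assumed to be {g(z, x) : ||z||_2 <= eps},
    where g is a conditional generator/decoder (hypothetical signature).
    """
    z = torch.zeros(x.size(0), latent_dim, device=x.device, requires_grad=True)
    for _ in range(T):
        loss = F.cross_entropy(model(g(z, x)), y)
        grad, = torch.autograd.grad(loss, z)
        with torch.no_grad():
            # ascend the loss, then project z back onto the eps-ball
            z = z + alpha * grad / (grad.norm(dim=1, keepdim=True) + 1e-12)
            z = z * (eps / z.norm(dim=1, keepdim=True).clamp(min=eps))
        z.requires_grad_(True)
    with torch.no_grad():
        return g(z, x)

def adversarial_training_epoch(model, g, loader, opt, eps,
                               alpha=0.1, T=10, latent_dim=128):
    """Rough shape of Algorithm 1: one epoch of adversarial training."""
    model.train()
    for x, y in loader:
        x_adv = pgd_latent_attack(model, g, x, y, eps, alpha, T, latent_dim)
        opt.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        opt.step()
```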
Open Source Code | Yes | All code and configuration files for reproducing the experiments as well as pretrained model weights can be found at https://github.com/locuslab/perturbation_learning.
Open Datasets | Yes | We first demonstrate how the approach can learn basic ℓ∞ and rotation-translation-skew (RTS) perturbations (Jaderberg et al., 2015) in the MNIST setting. We next look at a more difficult setting which cannot be mathematically defined, and learn a perturbation set for common image corruptions on CIFAR10 (Hendrycks & Dietterich, 2019). In our final setting, we learn a perturbation set that captures real-world variations in lighting using a multi-illumination dataset of scenes captured in the wild (Murmann et al., 2019).
Dataset Splits | Yes | We use the standard MNIST dataset consisting of 60,000 training examples with 1,000 examples randomly set aside for validation purposes, and 10,000 examples in the test set. We generate a validation set from the training set by randomly setting aside 1/50 of the CIFAR10 training set and all of their corresponding corrupted variants.
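For concreteness, the splits described above could be reproduced roughly as follows; the use of torchvision, random_split, and the fixed seed are illustrative assumptions, and the repository may construct the splits differently.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# MNIST: 60,000 training examples, 1,000 held out for validation
# (split sizes from the paper; seed choice is illustrative).
mnist = datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor())
train_set, val_set = random_split(
    mnist, [59_000, 1_000], generator=torch.Generator().manual_seed(0))

# CIFAR10: hold out 1/50 of the training indices; the same indices would
# also select the corresponding corrupted variants for validation.
cifar = datasets.CIFAR10("data", train=True, download=True,
                         transform=transforms.ToTensor())
n_val = len(cifar) // 50  # 1,000 of the 50,000 training images
perm = torch.randperm(len(cifar), generator=torch.Generator().manual_seed(0))
val_idx, train_idx = perm[:n_val].tolist(), perm[n_val:].tolist()
```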
Hardware Specification | Yes | These experiments were run on a single GeForce RTX 2080 Ti graphics card, with the longest perturbation set taking 1 hour to train. These experiments were run on a single Quadro RTX 8000 graphics card, taking 16 hours to train the CVAE and 12 hours to run adversarial training.
Software Dependencies | No | The paper mentions using the Adam optimizer, which is a software component, but does not specify its version number or any other software dependencies with version numbers.
Experiment Setup | Yes | Both networks are trained for 20 epochs, with step size following a piece-wise linear schedule of [0, 0.001, 0.0005, 0.0001] over epochs [0, 10, 15, 20] with the Adam optimizer using batch size 128. Training is done for 1000 epochs using a cyclic learning rate (Smith, 2017), peaking at 0.001 on the 400th epoch, using the Adam optimizer (Kingma & Ba, 2014) with momentum 0.9 and batch size 128.
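The two learning-rate schedules quoted above can be sketched as below, assuming "step size" refers to the learning rate and that "momentum 0.9" maps to Adam's beta1; the placeholder model and loop are illustrative only.

```python
import numpy as np
import torch

# Piece-wise linear schedule: interpolate the learning rate through
# [0, 0.001, 0.0005, 0.0001] at epochs [0, 10, 15, 20].
def piecewise_lr(epoch):
    return float(np.interp(epoch, [0, 10, 15, 20], [0.0, 0.001, 0.0005, 0.0001]))

# Cyclic schedule (Smith, 2017): ramp up to 0.001 at epoch 400,
# then back down to 0 by epoch 1000.
def cyclic_lr(epoch, peak=0.001, peak_epoch=400, total=1000):
    return float(np.interp(epoch, [0, peak_epoch, total], [0.0, peak, 0.0]))

# Placeholder model; "momentum 0.9" is read here as Adam's beta1 = 0.9.
model = torch.nn.Linear(784, 10)
opt = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))

for epoch in range(20):
    for group in opt.param_groups:
        group["lr"] = piecewise_lr(epoch)
    # ... run one training epoch with batch size 128 at this learning rate
```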