Adversarial Unlearning: Reducing Confidence Along Adversarial Directions
Authors: Amrith Setlur, Benjamin Eysenbach, Virginia Smith, Sergey Levine
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments aim to study the effect that RCAD has on test accuracy, both in comparison to and in addition to existing regularization techniques and adversarial methods, so as to understand the degree to which its effect is complementary to existing methods. Benchmark datasets. We use six image classification benchmarks. |
| Researcher Affiliation | Academia | Amrith Setlur¹, Benjamin Eysenbach¹, Virginia Smith¹, Sergey Levine²; ¹Carnegie Mellon University, ²UC Berkeley |
| Pseudocode | Yes | RCAD: Reducing Confidence along Adversarial Directions. `def rcad_loss(x, y, α, λ): loss = -model(x).log_prob(y); x_adv = x + α * loss.grad(x); entropy = model(x_adv).entropy(); return loss - λ * entropy` (the minus signs appear dropped in extraction and are restored here; a runnable sketch follows the table) |
| Open Source Code | Yes | Code for this work can be found at https://github.com/ars22/RCAD-regularizer. |
| Open Datasets | Yes | Benchmark datasets. We use six image classification benchmarks. In addition to CIFAR-10, CIFAR-100 [30], SVHN [41] and Tiny Imagenet [34], we modify CIFAR-100 by randomly subsampling 2,000 and 10,000 training examples (from the original 50,000) to create CIFAR-100-2k and CIFAR-100-10k. |
| Dataset Splits | Yes | If the validation split is not provided by the benchmark, we hold out 10% of our training examples for validation. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or cloud instances with specs) were provided. The paper only mentions using ResNet-18 and Wide ResNet 28-10 backbones for training. |
| Software Dependencies | No | No specific ancillary software details with version numbers were provided. The paper mentions using SGD and standard deep learning models like ResNet-18. |
| Experiment Setup | Yes | Unless specified otherwise, we train all methods using the ResNet-18 [19] backbone, and to accelerate training loss convergence we clip gradients in the l2 norm (at 1.0) [71, 18]. We train all models for 200 epochs and use SGD with an initial learning rate of 0.1 and Nesterov momentum of 0.9, and decay the learning rate by a factor of 0.1 at epochs 100, 150 and 180 [10]. We select the model checkpoint corresponding to the epoch with the best accuracy on validation samples as the final model representing a given training method. For all datasets (except CIFAR-100-2k and CIFAR-100-10k, for which we used 32 and 64 respectively) the methods were trained with a batch size of 128. For details on algorithm-specific hyperparameter choices refer to Appendix B. (A training-loop sketch of this recipe follows the table.) |
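
The paper's pseudocode is deliberately compact, so here is a minimal PyTorch sketch of the same objective, assuming `model` is a classifier that maps an input batch to logits. The function name `rcad_loss` and the arguments (α, λ) mirror the paper's pseudocode; the tensor plumbing (`requires_grad_`, `autograd.grad`, the explicit entropy computation) is our own translation, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def rcad_loss(model, x, y, alpha, lam):
    """Cross-entropy on clean inputs minus a weighted entropy bonus on
    self-generated examples taken along the adversarial direction."""
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    # Step along the direction that increases the training loss;
    # retain_graph so ce can still be backpropagated through below.
    (grad,) = torch.autograd.grad(ce, x, retain_graph=True)
    # Detach: the adversarial example is treated as fixed data
    # (a stop-gradient), not differentiated through.
    x_adv = (x + alpha * grad).detach()
    # Predictive entropy of the model on the adversarial examples.
    log_p = F.log_softmax(model(x_adv), dim=-1)
    entropy = -(log_p.exp() * log_p).sum(dim=-1).mean()
    # Minimizing (ce - lam * entropy) keeps the model accurate on real
    # data while reducing its confidence along adversarial directions.
    return ce - lam * entropy
```

In use, this drops in wherever a standard cross-entropy loss would go, e.g. `loss = rcad_loss(model, x, y, alpha, lam); loss.backward()`. The values of α and λ are dataset-specific hyperparameters; the paper defers those choices to its Appendix B.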
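
The quoted training recipe also maps directly onto standard PyTorch calls. The sketch below is our reading of that setup, not the authors' script; it assumes `model`, `train_loader`, `alpha`, and `lam` are defined as in the loss sketch above.

```python
import torch

# SGD with initial learning rate 0.1 and Nesterov momentum 0.9.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, nesterov=True)
# Decay the learning rate by a factor of 0.1 at epochs 100, 150, 180.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150, 180], gamma=0.1)

for epoch in range(200):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = rcad_loss(model, x, y, alpha, lam)
        loss.backward()
        # Clip gradients in the l2 norm at 1.0, per the quoted setup.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
    scheduler.step()
```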