Adversarial Unlearning: Reducing Confidence Along Adversarial Directions
Authors: Amrith Setlur, Benjamin Eysenbach, Virginia Smith, Sergey Levine
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments aim to study the effect that RCAD has on test accuracy, both in comparison to and in addition to existing regularization techniques and adversarial methods, so as to understand the degree to which its effect is complementary to existing methods. Benchmark datasets. We use six image classification benchmarks. |
| Researcher Affiliation | Academia | Amrith Setlur¹, Benjamin Eysenbach¹, Virginia Smith¹, Sergey Levine²; ¹Carnegie Mellon University, ²UC Berkeley |
| Pseudocode | Yes | RCAD: Reducing Confidence along Adversarial Directions. `def rcad_loss(x, y, α, λ): loss = -model(x).log_prob(y); x_adv = x + α * loss.grad(x); entropy = model(x_adv).entropy(); return loss - λ * entropy` (the minus signs appear dropped in extraction and are restored here; a runnable sketch follows the table) |
| Open Source Code | Yes | Code for this work can be found at https://github.com/ars22/RCAD-regularizer. |
| Open Datasets | Yes | Benchmark datasets. We use six image classification benchmarks. In addition to CIFAR-10, CIFAR-100 [30], SVHN [41] and Tiny Imagenet [34], we modify CIFAR-100 by randomly subsampling 2,000 and 10,000 training examples (from the original 50,000) to create CIFAR-100-2k and CIFAR-100-10k. |
| Dataset Splits | Yes | If the validation split is not provided by the benchmark, we hold out 10% of our training examples for validation. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or cloud instances with specs) were provided. The paper only mentions using ResNet-18 and Wide ResNet 28-10 backbones for training. |
| Software Dependencies | No | No specific ancillary software details with version numbers were provided. The paper mentions using SGD and standard deep learning models like ResNet-18. |
| Experiment Setup | Yes | Unless specified otherwise, we train all methods using the ResNet-18 [19] backbone, and to accelerate training loss convergence we clip gradients in the l2 norm (at 1.0) [71, 18]. We train all models for 200 epochs and use SGD with an initial learning rate of 0.1 and Nesterov momentum of 0.9, and decay the learning rate by a factor of 0.1 at epochs 100, 150 and 180 [10]. We select the model checkpoint corresponding to the epoch with the best accuracy on validation samples as the final model representing a given training method. For all datasets (except CIFAR-100-2k and CIFAR-100-10k, for which we used 32 and 64 respectively) the methods were trained with a batch size of 128. For details on algorithm-specific hyperparameter choices refer to Appendix B. (A training-loop sketch of this recipe follows the table.) |
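
The paper's pseudocode is deliberately compact, so here is a minimal PyTorch sketch of the same objective, assuming `model` is a classifier that maps an input batch to logits. The function name `rcad_loss` and the arguments (α, λ) mirror the paper's pseudocode; the tensor plumbing (`requires_grad_`, `autograd.grad`, the explicit entropy computation) is our own translation, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def rcad_loss(model, x, y, alpha, lam):
    """Cross-entropy on clean inputs minus a weighted entropy bonus on
    self-generated examples taken along the adversarial direction."""
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    # Step along the direction that increases the training loss;
    # retain_graph so ce can still be backpropagated through below.
    (grad,) = torch.autograd.grad(ce, x, retain_graph=True)
    # Detach: the adversarial example is treated as fixed data
    # (a stop-gradient), not differentiated through.
    x_adv = (x + alpha * grad).detach()
    # Predictive entropy of the model on the adversarial examples.
    log_p = F.log_softmax(model(x_adv), dim=-1)
    entropy = -(log_p.exp() * log_p).sum(dim=-1).mean()
    # Minimizing (ce - lam * entropy) keeps the model accurate on real
    # data while reducing its confidence along adversarial directions.
    return ce - lam * entropy
```

In use, this drops in wherever a standard cross-entropy loss would go, e.g. `loss = rcad_loss(model, x, y, alpha, lam); loss.backward()`. The values of α and λ are dataset-specific hyperparameters; the paper defers those choices to its Appendix B.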
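
The quoted training recipe also maps directly onto standard PyTorch calls. The sketch below is our reading of that setup, not the authors' script; it assumes `model`, `train_loader`, `alpha`, and `lam` are defined as in the loss sketch above.

```python
import torch

# SGD with initial learning rate 0.1 and Nesterov momentum 0.9.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, nesterov=True)
# Decay the learning rate by a factor of 0.1 at epochs 100, 150, 180.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150, 180], gamma=0.1)

for epoch in range(200):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = rcad_loss(model, x, y, alpha, lam)
        loss.backward()
        # Clip gradients in the l2 norm at 1.0, per the quoted setup.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
    scheduler.step()
```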