Conserve-Update-Revise to Cure Generalization and Robustness Trade-off in Adversarial Training
Authors: Shruthi Gowda, Bahram Zonooz, Elahe Arani
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical findings demonstrate that selectively updating specific layers while preserving others can substantially enhance the network's learning capacity. We therefore propose CURE, a novel training framework that leverages a gradient prominence criterion to perform selective conservation, updating, and revision of weights. |
| Researcher Affiliation | Collaboration | 1 NavInfo Europe, 2 Eindhoven University of Technology, 3 TomTom, 4 Wayve |
| Pseudocode | Yes | Algorithm is detailed in Appendix, Section A. Algorithm 1 CURE: Conserve-Update-Revise |
| Open Source Code | Yes | The code is available at: https://github.com/NeurAI-Lab/CURE. |
| Open Datasets | Yes | Datasets used in our study include CIFAR-10, CIFAR-100 (Krizhevsky, 2009) and SVHN. |
| Dataset Splits | No | The paper mentions training on datasets like CIFAR-10, CIFAR-100, and SVHN, and refers to "Adversarial Acc (validation)" in Figure 5(c), but it does not provide specific details on how the training, validation, or test splits were performed (e.g., percentages, sample counts, or explicit references to standard split methodologies). |
| Hardware Specification | No | The paper states that "all models are trained using the SGD optimizer" and discusses various experimental parameters, but it does not provide any specific details about the hardware used (e.g., CPU, GPU models, memory, or cloud instances). |
| Software Dependencies | No | The paper mentions the use of an "SGD optimizer" and "Projected Gradient Descent (PGD)" for adversarial image generation, but it does not specify the version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used. |
| Experiment Setup | Yes | For our method, all models are trained using the SGD optimizer with a momentum of 0.9. The augmentations include basic random crop and random flip operations. Projected Gradient Descent (PGD) is used to generate adversarial images. For adversarial training, PGD with step 10 is considered with perturbation strength ϵ = 8 and step size ϵ/4. Table 9 tabulates the other hyperparameters used in our method. The learning rate is 0.1, the number of epochs is 200 and the weight decay is 5e-3. The revision rate r and decay factor d for the revision stage are set to 0.2 and 0.999 for all the experiments. |
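
To make the quoted experiment setup concrete for reproduction attempts, below is a minimal PyTorch-style sketch of PGD-10 adversarial training with the reported optimizer settings (SGD, momentum 0.9, learning rate 0.1, weight decay 5e-3, 200 epochs, step size ϵ/4). It is not the authors' released code: `model` and `train_loader` are hypothetical placeholders, and ϵ = 8 is assumed to mean 8/255 on images scaled to [0, 1].

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, steps=10):
    """PGD with `steps` iterations and step size eps/4, as reported in the paper."""
    alpha = eps / 4
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
        delta = (x + delta).clamp(0, 1) - x  # keep the perturbed image in [0, 1]
    return (x + delta).detach()

# Hypothetical `model` and `train_loader`; only the hyperparameters come from the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-3)

for epoch in range(200):
    for x, y in train_loader:
        x_adv = pgd_attack(model, x, y)
        loss = F.cross_entropy(model(x_adv), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```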
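
The CURE-specific components quoted in the Research Type row (the gradient prominence criterion and the revision stage with r = 0.2, d = 0.999) are not fully specified in the excerpts above. Purely as a speculative illustration of what "selective conservation and updating" plus a stochastic revision could look like, and not the paper's verified procedure, a per-parameter gradient-magnitude mask and a probabilistically refreshed EMA copy of the weights might be sketched as follows; `keep_fraction`, `apply_prominence_mask`, and `maybe_revise` are invented names.

```python
import random
import torch

def apply_prominence_mask(model, keep_fraction=0.5):
    # Assumed criterion: update only the top `keep_fraction` of weights per tensor,
    # ranked by gradient magnitude, and conserve the rest by zeroing their gradients.
    # The paper's actual prominence measure may differ.
    for p in model.parameters():
        if p.grad is None:
            continue
        g = p.grad.abs().flatten()
        num_keep = max(1, int(keep_fraction * g.numel()))
        threshold = torch.topk(g, num_keep).values.min()
        p.grad.mul_((p.grad.abs() >= threshold).float())

def maybe_revise(ema_model, model, r=0.2, d=0.999):
    # Assumed revision stage: with probability r, blend the live weights into an
    # EMA copy with decay d (one possible reading of the reported r and d values).
    if random.random() < r:
        with torch.no_grad():
            for p_ema, p in zip(ema_model.parameters(), model.parameters()):
                p_ema.mul_(d).add_(p, alpha=1 - d)
```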