Fast Training of Provably Robust Neural Networks by SingleProp

Authors: Akhilan Boopathy, Lily Weng, Sijia Liu, Pin-Yu Chen, Gaoyuan Zhang, Luca Daniel (pp. 6803-6811)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'Through experiments on MNIST and CIFAR-10 we demonstrate improvements in training speed and comparable certified accuracy compared to state-of-the-art certified defenses.' 'Extensive experiments demonstrate SingleProp achieves superior computational efficiency and comparable certified accuracies compared to the current fastest certified robust training method IBP (Gowal et al. 2019).'
Researcher Affiliation | Collaboration | Akhilan Boopathy¹, Lily Weng¹, Sijia Liu², Pin-Yu Chen², Gaoyuan Zhang², Luca Daniel¹ (¹Massachusetts Institute of Technology; ²MIT-IBM Watson AI Lab, IBM Research)
Pseudocode | Yes | See Appendix B, Algorithm 1 for the full procedure.
Open Source Code | No | The paper states: 'We use the official code released by (Gowal et al. 2019) to implement IBP training.' This refers to a third-party baseline's code and does not indicate that the authors' own code for the SingleProp method is publicly released.
Open Datasets | Yes | 'Through experiments on MNIST and CIFAR-10 we demonstrate improvements in training speed and comparable certified accuracy compared to state-of-the-art certified defenses.' 'We directly use the code provided for IBP (Gowal et al. 2019) and use the same CNN architectures (small, medium, large and wide) on MNIST and CIFAR-10 datasets.'
Dataset Splits | Yes | 'To ensure consistent performance without extensive hyperparameter tuning, we propose an adaptive method of selecting the regularization hyperparameter λ using a validation set.' 'The schedule of learning rates is tuned for each method individually using a validation set.' (A generic validation-split sketch is given after this table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. It only discusses the computational efficiency of the methods in terms of time and memory overhead.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.x'). It mentions using 'the official code provided for IBP (Gowal et al. 2019)' but does not detail the software stack or versions used for their own implementation.
Experiment Setup | Yes | 'The MNIST networks are trained for 100 epochs each with a batch size of 100 while the CIFAR networks are trained for 350 epochs each with a batch size of 50.' 'We use the standard values of ϵ = 0.3 for MNIST and ϵ = 8/255 for CIFAR as the training target perturbation size ϵ_train.' 'Following (Gowal et al. 2019), the schedule of ϵ starts at 0 for a warmup period (2000 training steps on MNIST, 5000 training steps on CIFAR), followed by a linear increase to the desired target perturbation size (10000 training steps on MNIST, 50000 training steps on CIFAR), after which ϵ is fixed at the target level.' 'We propose an adaptive method of selecting the regularization hyperparameter λ.' (The ϵ schedule is sketched in code after this table.)
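
The Dataset Splits row notes that a validation set is used both to tune learning-rate schedules and to select the regularization weight λ adaptively; the quoted text does not spell out the selection rule. The sketch below only illustrates the generic hold-out pattern this relies on: split off part of the training set and keep the λ candidate that scores best on it. The split sizes, the candidate list, and the helper names train_fn and evaluate_fn are hypothetical, not taken from the paper's code.

```python
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Hold out part of the MNIST training set as a validation split (the 54k/6k split is illustrative).
full_train = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
train_set, val_set = random_split(full_train, [54000, 6000])
train_loader = DataLoader(train_set, batch_size=100, shuffle=True)  # batch size 100, as quoted above
val_loader = DataLoader(val_set, batch_size=100)

def select_lambda(candidates, train_fn, evaluate_fn):
    """Generic validation-based selection: train with each candidate lambda and keep the
    one with the best validation score. This is only the usual hold-out pattern, not the
    paper's specific adaptive rule."""
    best_lam, best_score = None, float("-inf")
    for lam in candidates:
        model = train_fn(train_loader, lam)      # hypothetical training routine
        score = evaluate_fn(model, val_loader)   # e.g. verified or clean validation accuracy
        if score > best_score:
            best_lam, best_score = lam, score
    return best_lam
```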
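
The Experiment Setup row describes the ϵ schedule in words: zero during a warmup period, a linear ramp to the target perturbation size, then constant. A minimal sketch of that schedule as a function of the training step is given below, using the step counts quoted above; the function name epsilon_schedule is illustrative rather than taken from the released IBP code.

```python
def epsilon_schedule(step, eps_target, warmup_steps, ramp_steps):
    """Perturbation size at a given training step: 0 during warmup, then a linear
    increase to eps_target over ramp_steps, then held at eps_target."""
    if step < warmup_steps:
        return 0.0
    if step < warmup_steps + ramp_steps:
        return eps_target * (step - warmup_steps) / ramp_steps
    return eps_target

# Settings quoted above:
#   MNIST:    eps_target = 0.3,   warmup = 2000 steps, ramp = 10000 steps
#   CIFAR-10: eps_target = 8/255, warmup = 5000 steps, ramp = 50000 steps
eps_at_step = [epsilon_schedule(s, 0.3, 2000, 10000) for s in (0, 2000, 7000, 12000)]
# -> [0.0, 0.0, 0.15, 0.3]
```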