Towards Certifying ℓ∞ Robustness using Neural Networks with ℓ∞-dist Neurons
Authors: Bohang Zhang, Tianle Cai, Zhou Lu, Di He, Liwei Wang
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that using ℓ∞-dist nets as basic building blocks, we consistently achieve state-of-the-art performance on commonly used datasets: 93.09% certified accuracy on MNIST (ϵ = 0.3), 35.42% on CIFAR-10 (ϵ = 8/255) and 16.31% on TinyImageNet (ϵ = 1/255). |
| Researcher Affiliation | Collaboration | 1Key Laboratory of Machine Perception, MOE, School of EECS, Peking University 2Department of Electrical and Computer Engineering, Princeton University 3Zhongguancun Haihua Institute for Frontier Information Technology 4Department of Computer Science, Princeton University 5Microsoft Research 6Center for Data Science, Peking University. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Finally, we provide all the implementation details and codes at https://github.com/zbh2047/L_inf-dist-net. |
| Open Datasets | Yes | We train our models on four popular benchmark datasets: MNIST, Fashion-MNIST, CIFAR-10 and TinyImageNet. |
| Dataset Splits | No | The paper mentions training and testing procedures and some data augmentation strategies, but it does not specify explicit training/validation/test splits by percentage or count. It refers to the 'training set' and 'test set' but does not describe a separate validation split or any splitting ratios. |
| Hardware Specification | Yes | All these experiments are run on a single NVIDIA-RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions 'Adam optimizer with hyper-parameters β1 = 0.9, β2 = 0.99 and ϵ = 10⁻¹⁰', but it does not specify any software names with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | In all experiments, we train ℓ∞-dist Net and ℓ∞-dist Net+MLP using Adam optimizer with hyper-parameters β1 = 0.9, β2 = 0.99 and ϵ = 10⁻¹⁰. The batch size is set to 512. For data augmentation, we use random crop (padding=1) for MNIST and Fashion-MNIST, and use random crop (padding=4) and random horizontal flip for CIFAR-10, following the common practice. For the TinyImageNet dataset, we use random horizontal flip and crop each image to 56×56 pixels for training, and use a center crop for testing, which is the same as Xu et al. (2020a). As for the loss function, we use multi-class hinge loss for ℓ∞-dist Net and the IBP loss (Gowal et al., 2018) for ℓ∞-dist Net+MLP. The training procedure is as follows. First, we relax the ℓ∞-dist net to an ℓp-dist net by setting p = 8 and train the network for e1 epochs. Then we gradually increase p from 8 to 1000 exponentially in the next e2 epochs. Finally, we set p = ∞ and train the last e3 epochs. Here e1, e2 and e3 are hyper-parameters that vary across datasets. We use lr = 0.02 in the first e1 epochs and decrease the learning rate using cosine annealing for the next e2 + e3 epochs. We use ℓp-norm weight decay for ℓ∞-dist nets and ℓ2-norm weight decay for the MLP with coefficient λ = 0.005. All these explicitly specified hyper-parameters are kept fixed across different architectures and datasets. For ℓ∞-dist Net+MLP training, we use the same linear warmup strategy for the hyper-parameter ϵtrain as in Gowal et al. (2018); Zhang et al. (2020b). See Appendix D (Table 6) for details of the training configuration and hyper-parameters. A sketch of the ℓp-dist neuron and this p-relaxation schedule is given below the table. |
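
Below is a minimal sketch, assuming PyTorch, of the ℓp-dist unit u(x) = ‖x − w‖_p + b from the paper and of an exponential p schedule matching the setup quoted above (p held at 8 for e1 epochs, grown to 1000 over the next e2 epochs, then set to ∞). The names `LpDistNeuron` and `p_schedule`, the exact ramp formula, and the shapes in the usage example are illustrative assumptions, not the authors' implementation; see the linked repository for the reference code.

```python
# Minimal sketch (not the authors' code) of an l_p-dist neuron and the
# p-relaxation schedule described in the experiment setup above.
import math
import torch
import torch.nn as nn


class LpDistNeuron(nn.Module):
    """l_p-dist unit: u(x) = ||x - w||_p + b for each output unit.

    As p -> infinity this approaches the l_inf-dist neuron described in the paper.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor, p: float) -> torch.Tensor:
        # x: (batch, in_features); diff: (batch, out_features, in_features)
        diff = x.unsqueeze(1) - self.weight.unsqueeze(0)
        if math.isinf(p):
            dist = diff.abs().amax(dim=-1)                        # exact l_inf distance
        else:
            dist = torch.linalg.vector_norm(diff, ord=p, dim=-1)  # l_p relaxation
        return dist + self.bias


def p_schedule(epoch: int, e1: int, e2: int,
               p_start: float = 8.0, p_end: float = 1000.0) -> float:
    """Hold p = 8 for the first e1 epochs, grow it exponentially to 1000 over
    the next e2 epochs, then switch to p = inf for the remaining e3 epochs."""
    if epoch < e1:
        return p_start
    if epoch < e1 + e2:
        t = (epoch - e1 + 1) / e2                 # fraction of the ramp completed
        return p_start * (p_end / p_start) ** t
    return math.inf


# Hypothetical usage with the hyper-parameters quoted above (lr = 0.02,
# betas = (0.9, 0.99), eps = 1e-10, batch size 512); layer sizes are illustrative.
layer = LpDistNeuron(in_features=784, out_features=128)
opt = torch.optim.Adam(layer.parameters(), lr=0.02, betas=(0.9, 0.99), eps=1e-10)
x = torch.randn(512, 784)
out = layer(x, p=p_schedule(epoch=0, e1=10, e2=50))  # -> shape (512, 128)
```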