Rethinking Lipschitz Neural Networks and Certified Robustness: A Boolean Function Perspective

Authors: Bohang Zhang, Du Jiang, Di He, Liwei Wang

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that our approach is scalable, efficient, and consistently yields better certified robustness across multiple datasets and perturbation radii than prior Lipschitz networks.
Researcher Affiliation | Academia | Bohang Zhang (1,3,4), Du Jiang (1), Di He (1), Liwei Wang (1,2); 1 National Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University; 2 Center for Data Science, Peking University; 3 Peng Cheng Laboratory; 4 Pazhou Laboratory (Huangpu). {zhangbohang,dujiang,dihe}@pku.edu.cn, wanglw@cis.pku.edu.cn
Pseudocode | No | The paper does not contain explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | No | Our code and trained models are released at https://github.com/zbh2047/SortNet. (Note: the Ethics Statement explicitly states "Our code and models will be released once the paper is published.", implying the code was not publicly available at the time of submission and review, so concrete access was not provided.)
Open Datasets | Yes | We use public benchmark datasets which can be downloaded freely. (Mentioned datasets include MNIST [33], CIFAR-10 [29], Tiny-ImageNet [32], and ImageNet (64×64) [7].)
Dataset Splits | Yes | In Appendix H, we also show the training variance of each setting by running 8 sets of experiments independently, and full results (including the median performance) are reported in Tables 9 and 10.
Hardware Specification | Yes | For a fair comparison, we reproduce most of the baseline methods using the official codes and report the wall-clock time under the same NVIDIA RTX 3090 GPU.
Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., Python 3.x, PyTorch 1.x) in the main text.
Experiment Setup | Yes | SortNet model configuration. Since SortNet generalizes the ℓ∞-distance net, we simply follow the same model configurations as [73] and consider two types of models. The first one is a simple SortNet consisting of M fully-connected layers with a hidden size of 5120, which is used on MNIST and CIFAR-10. Like [73], we choose M = 5 for MNIST and M = 6 for CIFAR-10. Since SortNet is Lipschitz, we directly apply the margin-based certification method to calculate the certified accuracy (Proposition 3.1). To achieve the best results on ImageNet-like datasets, in our second type of model we consider a composite architecture consisting of a base SortNet backbone and a prediction head (denoted SortNet+MLP). Following [73], the SortNet backbone has 5 layers with a width of 5120 neurons and serves as a robust feature extractor. The top prediction head is a lightweight 2-layer perceptron with 512 hidden neurons (or 2048 for ImageNet), which takes the robust features as input to produce classification results. We also try a larger SortNet backbone, denoted SortNet+MLP (2x), that has roughly four times the training cost (see Appendix E.2 for architectural details). We use the same approach as [73] to train and certify these models, i.e. by combining margin-based certification for the SortNet backbone and interval bound propagation for the top MLP [21]. ... Following common practice, we consider both a small perturbation radius ϵ = 0.1 and a larger one ϵ = 0.3. (for MNIST)
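
Illustrative sketch of the setup quoted above: the snippet below is a minimal, hedged PyTorch reconstruction of a plain SortNet-style classifier and of margin-based certification in the spirit of Proposition 3.1. The names SortishLayer, build_sortnet, and certified_radius are assumptions introduced here for illustration only; the layer simplifies the paper's SortNet unit to a fixed convex combination of order statistics, and the SortNet+MLP head with interval bound propagation is omitted. The authors' actual implementation is the released repository (https://github.com/zbh2047/SortNet).

import torch
import torch.nn as nn


class SortishLayer(nn.Module):
    """Simplified order-statistic layer: each unit outputs a fixed convex
    combination of the sorted entries of (w_k + x), which keeps the layer
    1-Lipschitz w.r.t. the l_inf norm. The paper's SortNet unit is more
    elaborate (dropout-style rate rho, custom CUDA kernel); this class is
    only a structural stand-in."""

    def __init__(self, in_features: int, out_features: int, rho: float = 0.9):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        # Geometrically decaying convex weights over order statistics
        # (an assumption about the deterministic form of the SortNet unit).
        alpha = rho ** torch.arange(in_features, dtype=torch.float32)
        self.register_buffer("alpha", alpha / alpha.sum())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features) -> (batch, out_features)
        z = x.unsqueeze(1) + self.weight.unsqueeze(0)          # (B, out, in)
        z_sorted, _ = torch.sort(z, dim=-1, descending=True)   # order statistics
        return (z_sorted * self.alpha).sum(dim=-1)


def build_sortnet(in_dim: int, num_classes: int, depth: int, width: int = 5120) -> nn.Sequential:
    """Plain SortNet as described for MNIST (depth = 5) / CIFAR-10 (depth = 6):
    `depth` fully-connected sort layers with a hidden width of 5120."""
    dims = [in_dim] + [width] * (depth - 1) + [num_classes]
    return nn.Sequential(*[SortishLayer(dims[i], dims[i + 1]) for i in range(depth)])


def certified_radius(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Margin-based certification for a classifier whose every logit is
    1-Lipschitz w.r.t. l_inf (cf. Proposition 3.1): the prediction is
    certifiably robust at radius eps whenever margin > 2 * eps."""
    top2 = logits.topk(2, dim=-1)
    correct = logits.argmax(dim=-1) == labels
    runner_up = torch.where(top2.indices[:, 0] == labels, top2.values[:, 1], top2.values[:, 0])
    margin = logits.gather(1, labels.unsqueeze(1)).squeeze(1) - runner_up
    return torch.where(correct, margin / 2.0, torch.zeros_like(margin))


if __name__ == "__main__":
    # Toy usage on random data, with a small width so the sketch runs quickly.
    model = build_sortnet(in_dim=784, num_classes=10, depth=5, width=64)
    x = torch.rand(4, 784)
    y = torch.randint(0, 10, (4,))
    radii = certified_radius(model(x), y)
    print("certified l_inf radii:", radii)
    print("certified at eps=0.1:", (radii > 0.1).float().mean().item())

Because every logit of such a network is 1-Lipschitz w.r.t. ℓ∞, a margin larger than 2ϵ certifies robustness at radius ϵ; this is how certified accuracy at the MNIST radii ϵ = 0.1 and ϵ = 0.3 would be checked in this simplified setting.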