LOT: Layer-wise Orthogonal Training on Improving ℓ2 Certified Robustness

Authors: Xiaojun Xu, Linyi Li, Bo Li

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We conduct comprehensive evaluations for LOT under different settings. We show that LOT significantly outperforms baselines regarding deterministic ℓ2 certified robustness, and scales to deeper neural networks." "We conduct comprehensive experiments to evaluate our approach" |
| Researcher Affiliation | Academia | Xiaojun Xu, Linyi Li, Bo Li — University of Illinois Urbana-Champaign, {xiaojun3, linyi2, lbo}@illinois.edu |
| Pseudocode | Yes | "The detailed algorithm is shown in Appendix B." |
| Open Source Code | Yes | "The code is available at https://github.com/AI-secure/Layerwise-Orthogonal-Training." |
| Open Datasets | Yes | "We focus on the CIFAR-10 and CIFAR-100 datasets" and "In semi-supervised learning, we use the 500K data introduced in [4] as the unlabelled dataset." |
| Dataset Splits | No | The paper mentions training on CIFAR-10 and CIFAR-100 and evaluating on a "testing set", but does not explicitly provide training/validation/test split details (e.g., percentages or sample counts for each split). |
| Hardware Specification | Yes | "For the evaluation time comparison, we show the runtime taken to do a full pass on the testing set evaluated on an NVIDIA RTX A6000 GPU." |
| Software Dependencies | No | The paper describes its methods and training parameters but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions). |
| Experiment Setup | Yes | "To train the LOT network, we will train the model for 200 epochs using a momentum SGD optimizer with an initial learning rate 0.1 and decay by 0.1 at the 100-th and 150-th epochs. We use Newton's iteration with 10 steps which we observe is enough for convergence (see Appendix E.4). When CReg loss is applied, we use γ = 0.5; when HH activation is applied, we use the version of order 1. We add the residual connection with a fixed λ = 0.5 for LOT; for SOC, we use their original version, as we observe that residual connections even hurt their performance (see discussions in Section 6.3)." |
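The "Newton's iteration with 10 steps" quoted above refers to an iterative approximation of a matrix inverse square root used when orthogonalizing layer weights. As a point of reference, here is a minimal Newton–Schulz sketch of that computation, assuming PyTorch and a symmetric positive-definite input; the function name, the normalization choice, and the step count default are ours, not the authors' implementation.

```python
import torch

def newton_inv_sqrt(A: torch.Tensor, steps: int = 10) -> torch.Tensor:
    """Approximate A^{-1/2} via the coupled Newton-Schulz iteration.

    The paper reports that 10 iterations suffice for convergence
    (Appendix E.4); this sketch assumes A is symmetric positive-definite.
    """
    n = A.shape[-1]
    I = torch.eye(n, dtype=A.dtype, device=A.device)
    norm = A.norm()           # rescale so the iteration converges
    Y, Z = A / norm, I.clone()
    for _ in range(steps):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y, Z = Y @ T, T @ Z   # Y -> (A/norm)^{1/2}, Z -> (A/norm)^{-1/2}
    return Z / norm.sqrt()    # undo the rescaling: Z/sqrt(norm) -> A^{-1/2}
```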
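The optimizer and schedule in the Experiment Setup row translate directly into a few lines of training code. A hedged sketch follows, again assuming PyTorch; `model`, `train_loader`, the momentum coefficient (0.9), and the plain cross-entropy loss are placeholders — the paper additionally applies a CReg term with γ = 0.5, which is not reproduced here.

```python
import torch
import torch.nn.functional as F

def train(model, train_loader, device="cuda"):
    # 200 epochs, momentum SGD, lr 0.1 decayed by 0.1 at epochs 100 and 150,
    # as quoted in the Experiment Setup row above.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[100, 150], gamma=0.1)
    for epoch in range(200):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            loss = F.cross_entropy(model(x), y)  # paper adds CReg (γ = 0.5)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
```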