Understanding Certified Training with Interval Bound Propagation

Authors: Yuhao Mao, Mark Niklas Mueller, Marc Fischer, Martin Vechev

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we show how these results on DLNs transfer to ReLU networks, before conducting an extensive empirical study, (i) confirming this transferability and yielding state-of-the-art certified accuracy..." and Section 4 "EMPIRICAL EVALUATION ANALYSIS".
Researcher Affiliation | Academia | Department of Computer Science, ETH Zürich, Switzerland; {yuhao.mao, mark.mueller, marc.fischer, martin.vechev}@inf.ethz.ch
Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper.
Open Source Code | Yes | We publish our code, trained models, and detailed instructions on how to reproduce our results at https://github.com/eth-sri/ibp-propagation-tightness.
Open Datasets | Yes | We use the MNIST (Le Cun et al., 2010) and CIFAR-10 (Krizhevsky et al., 2009) datasets for our experiments. Both are open-source and freely available. (See the loading sketch after the table.)
Dataset Splits | Yes | We use the MNIST (Le Cun et al., 2010) and CIFAR-10 (Krizhevsky et al., 2009) datasets for our experiments. Both are open-source and freely available.
Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., CPU, GPU models, or memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions providing code but does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) in the text.
Experiment Setup | Yes | Specifically, for MNIST, the first 20 epochs are used for ϵ-scheduling, increasing ϵ smoothly from 0 to the target value. Then, we train an additional 50 epochs with two learning rate decays of 0.2 at epochs 50 and 60, respectively. For CIFAR-10, we use 80 epochs for ϵ-annealing, after training models with standard training for 1 epoch. We continue training for 80 further epochs with two learning rate decays of 0.2 at epochs 120 and 140, respectively. The initial learning rate is 5e-3 and the gradients are clipped to an L2 norm of at most 10.0 before every step.
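
The MNIST and CIFAR-10 data referenced in the Open Datasets and Dataset Splits rows can be obtained with standard tooling. The following is a minimal sketch that assumes the torchvision distributions of both datasets and their default train/test splits; the linked repository may use its own loaders.

```python
# Hedged sketch: fetching MNIST and CIFAR-10 via torchvision (an assumption;
# the authors' repository may load the data differently).
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()  # scales pixel values to [0, 1]

# Default train/test splits as distributed; downloaded on first use.
mnist_train = datasets.MNIST("./data", train=True, download=True, transform=to_tensor)
mnist_test = datasets.MNIST("./data", train=False, download=True, transform=to_tensor)
cifar_train = datasets.CIFAR10("./data", train=True, download=True, transform=to_tensor)
cifar_test = datasets.CIFAR10("./data", train=False, download=True, transform=to_tensor)
```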
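The quoted MNIST schedule in the Experiment Setup row can be made concrete with a short sketch. The optimizer choice (Adam), the linear shape of the ϵ ramp, the example target ϵ of 0.3, and the `ibp_loss` helper are illustrative assumptions not stated in the quote; only the epoch counts (20 annealing epochs plus 50 further epochs), the learning-rate decays of 0.2 at epochs 50 and 60, the initial learning rate of 5e-3, and the gradient clipping to an L2 norm of at most 10.0 are taken from the quoted description.

```python
# Hedged sketch of the quoted MNIST training schedule. Adam, the linear
# epsilon ramp, eps_target=0.3, and `ibp_loss` are illustrative assumptions;
# epoch counts, LR milestones/factor, initial LR, and the clipping norm
# follow the quote above.
import torch

def epsilon_at(epoch, warmup_epochs=20, eps_target=0.3):
    """Linear stand-in for the smooth epsilon annealing (assumption):
    epsilon grows from 0 to eps_target over the first warmup_epochs epochs."""
    return eps_target * min(1.0, epoch / warmup_epochs)

def train_mnist(model, train_loader, ibp_loss, total_epochs=70):
    # `ibp_loss` stands for a certified (IBP) training loss; hypothetical helper.
    opt = torch.optim.Adam(model.parameters(), lr=5e-3)  # initial learning rate 5e-3
    sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[50, 60], gamma=0.2)
    for epoch in range(total_epochs):  # 20 annealing epochs + 50 further epochs
        eps = epsilon_at(epoch)
        for x, y in train_loader:
            loss = ibp_loss(model, x, y, eps)
            opt.zero_grad()
            loss.backward()
            # Clip gradients to an L2 norm of at most 10.0 before every step.
            torch.nn.utils.clip_grad_norm_(model.parameters(), 10.0)
            opt.step()
        sched.step()  # decays the learning rate by 0.2 at epochs 50 and 60
```

Per the same quote, a CIFAR-10 variant would differ only in the epoch counts (1 standard epoch, 80 annealing epochs, 80 further epochs) and the decay milestones (epochs 120 and 140).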