Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Understanding Certified Training with Interval Bound Propagation
Authors: Yuhao Mao, Mark Niklas Mueller, Marc Fischer, Martin Vechev
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we show how these results on DLNs transfer to ReLU networks, before conducting an extensive empirical study, (i) confirming this transferability and yielding state-of-the-art certified accuracy..." and Section 4 "EMPIRICAL EVALUATION ANALYSIS". |
| Researcher Affiliation | Academia | Department of Computer Science, ETH Zürich, Switzerland EMAIL |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper. |
| Open Source Code | Yes | We publish our code, trained models, and detailed instructions on how to reproduce our results at https://github.com/eth-sri/ibp-propagation-tightness. |
| Open Datasets | Yes | We use the MNIST (Le Cun et al., 2010) and CIFAR-10 (Krizhevsky et al., 2009) datasets for our experiments. Both are open-source and freely available. |
| Dataset Splits | Yes | We use the MNIST (Le Cun et al., 2010) and CIFAR-10 (Krizhevsky et al., 2009) datasets for our experiments. Both are open-source and freely available. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., CPU, GPU models, or memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions providing code but does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) in the text. |
| Experiment Setup | Yes | Specifically, for MNIST, the first 20 epochs are used for ϵ-scheduling, increasing ϵ smoothly from 0 to the target value. Then, we train an additional 50 epochs with two learning rate decays of 0.2 at epochs 50 and 60, respectively. For CIFAR-10, we use 80 epochs for ϵ-annealing, after training models with standard training for 1 epoch. We continue training for 80 further epochs with two learning rate decays of 0.2 at epochs 120 and 140, respectively. The initial learning rate is 5e-3 and the gradients are clipped to an L2 norm of at most 10.0 before every step. |
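
The schedule quoted in the Experiment Setup row can be sketched as two small functions, one for the perturbation radius ϵ and one for the learning rate. This is only an illustration under stated assumptions, not the authors' implementation: the names `Schedule`, `eps_at`, and `lr_at` are hypothetical, epochs are assumed to be 0-indexed, the CIFAR-10 total of 161 epochs assumes the 1 warm-up epoch is counted separately, and the linear ramp stands in for the "smooth" ϵ increase, whose exact shape the quoted text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """Hypothetical container for the quoted training schedule."""
    warmup_epochs: int         # standard training (eps = 0) before annealing
    anneal_epochs: int         # epochs over which eps grows to its target
    total_epochs: int          # assumed total, including warm-up
    decay_epochs: tuple        # epochs at which the learning rate is decayed
    base_lr: float = 5e-3      # initial learning rate from the quoted text
    decay_factor: float = 0.2  # multiplicative decay from the quoted text


# Values taken from the quoted setup; totals are derived, not stated.
MNIST = Schedule(warmup_epochs=0, anneal_epochs=20,
                 total_epochs=70, decay_epochs=(50, 60))
CIFAR10 = Schedule(warmup_epochs=1, anneal_epochs=80,
                   total_epochs=161, decay_epochs=(120, 140))


def eps_at(schedule: Schedule, epoch: int, eps_target: float) -> float:
    """Perturbation radius at the start of `epoch`.

    Assumption: a linear ramp replaces the unspecified smooth schedule.
    """
    if epoch < schedule.warmup_epochs:
        return 0.0
    progress = (epoch - schedule.warmup_epochs) / schedule.anneal_epochs
    return eps_target * min(1.0, progress)


def lr_at(schedule: Schedule, epoch: int) -> float:
    """Learning rate at `epoch` after applying every decay reached so far."""
    n_decays = sum(epoch >= e for e in schedule.decay_epochs)
    return schedule.base_lr * schedule.decay_factor ** n_decays


if __name__ == "__main__":
    for epoch in (0, 10, 20, 50, 60):
        print(epoch, eps_at(MNIST, epoch, 0.1), lr_at(MNIST, epoch))
```

The gradient clipping mentioned in the same row is not modelled here; in a PyTorch training loop it would typically be a call to `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)` before each optimizer step, though whether the released code does exactly this is not stated in the quoted text.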