Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Differentiable Abstract Interpretation for Provably Robust Neural Networks
Authors: Matthew Mirman, Timon Gehr, Martin Vechev
ICML 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implemented our approach in a system called DIFFAI and evaluated it extensively across a range of datasets and network architectures. We demonstrate that DIFFAI training scales to networks larger than those of prior work and that networks trained with DIFFAI are more provably robust than those trained with state-of-the-art defenses. |
| Researcher Affiliation | Academia | Department of Computer Science, ETH Zurich, Switzerland. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | A complete implementation of the method in a system called DIFFAI together with an extensive evaluation on a range of datasets and architectures. Our results show that DIFFAI improves provability of robustness and scales to large networks (Section 6). Available at: http://diffai.ethz.ch |
| Open Datasets | Yes | We evaluate DIFFAI on four different datasets: MNIST, CIFAR10, Fashion MNIST (F-MNIST) and SVHN. |
| Dataset Splits | No | The paper does not explicitly state the dataset splits for training, validation, and testing. It mentions using a 'test set' but gives no specific percentages or counts for the splits. |
| Hardware Specification | Yes | We ran on a GeForce GTX 1080 Ti, and a K80. |
| Software Dependencies | No | Our system is built on top of PyTorch (Paszke et al., 2017). No specific versions for PyTorch or other dependencies are mentioned. |
| Experiment Setup | Yes | For all of our experiments, we used the Adam Optimizer (Kingma & Ba, 2014), with the default parameters and a learning rate (lr) of 0.0001, unless otherwise specified. Additionally, we used norm-clipping on the weights after every batch with a max-norm of 10,000. For training we use a batch size of 500. |
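The experiment setup quoted above (Adam with default parameters and lr 0.0001, batch size 500, per-batch weight norm clipping with a max-norm of 10,000) can be sketched as a minimal PyTorch training step. This is an illustrative reconstruction, not the paper's code: the tiny model, dummy data, and the `clip_weight_norms` helper are assumptions, and the paper's DIFFAI training additionally propagates abstract domains through the network, which is not shown here.

```python
import torch
import torch.nn as nn

def clip_weight_norms(model: nn.Module, max_norm: float = 10_000.0) -> None:
    """Rescale any parameter tensor whose L2 norm exceeds max_norm.

    Hypothetical helper matching the paper's stated 'norm-clipping on the
    weights after every batch with a max-norm of 10,000'.
    """
    with torch.no_grad():
        for p in model.parameters():
            norm = p.norm()
            if norm > max_norm:
                p.mul_(max_norm / norm)

# Placeholder network; the paper evaluates a range of architectures.
model = nn.Sequential(nn.Linear(784, 100), nn.ReLU(), nn.Linear(100, 10))

# Adam with default betas/eps and lr = 0.0001, as stated in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# One dummy batch of size 500 (the paper's stated batch size).
x = torch.randn(500, 784)
y = torch.randint(0, 10, (500,))

optimizer.zero_grad()
loss_fn(model(x), y).backward()
optimizer.step()
clip_weight_norms(model)  # norm-clip weights after every batch
```

The weight clipping differs from the more common gradient clipping (`torch.nn.utils.clip_grad_norm_`): it rescales the parameters themselves after the optimizer step, which is why it is applied manually here.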