The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural Networks
Authors: Luca Marzari, Davide Corsi, Ferdinando Cicalese, Alessandro Farinelli
IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experimental results on a set of safety-critical benchmarks that demonstrate the effectiveness of our approximate method and evaluate the tightness of the bound. We evaluate our algorithms on a standard benchmark, ACAS Xu, showing that Counting Pro Ve is scalable and effective also for real-world problems. Table 1: Comparison of Counting Pro Ve and exact counter on different benchmark setups. |
| Researcher Affiliation | Academia | Luca Marzari, Davide Corsi, Ferdinando Cicalese and Alessandro Farinelli Department of Computer Science, University of Verona, Verona, Italy luca.marzari@univr.it, davide.corsi@univr.it |
| Pseudocode | Yes | Algorithm 1 Counting Pro Ve |
| Open Source Code | Yes | The code used to collect the results and several additional experiments and discussions on the impact of different hyperparameters and backends for our approximation are available in the supplementary material (available here). |
| Open Datasets | Yes | A popular and successful example is ACAS Xu , an airborne collision avoidance system for aircraft based on a DNN controller, which is now a standard DNN-verification benchmark. The ACAS Xu system is an airborne collision avoidance system for aircraft considered a well-known benchmark and a standard for formal verification of DNNs [Liu et al., 2021][Katz et al., 2017][Wang et al., 2018]. |
| Dataset Splits | No | The paper mentions training and testing, but it does not specify explicit dataset splits (e.g., 80/10/10 percentages, sample counts) for training, validation, or test sets used in their experiments. It refers to ACAS Xu as a benchmark but doesn't detail how they partitioned it for their specific runs. |
| Hardware Specification | Yes | All the data are collected on a commercial PC running Ubuntu 22.04 LTS equipped with Nvidia RTX 2070 Super and an Intel i7-9700k. |
| Software Dependencies | No | The paper mentions 'Ubuntu 22.04 LTS' (an OS version) and refers to 'Ba B [Bunel et al., 2018] available on Neural Verification.jl' and 'Pro Ve [Corsi et al., 2021]' as backend tools. However, it does not specify version numbers for Neural Verification.jl, Ba B, or Pro Ve, nor for any other specific libraries or programming languages used. |
| Experiment Setup | Yes | In our experiments with Counting Pro Ve, we set β = 0.02 and t = 350 in order to obtain a correctness confidence level greater or equal to 99% (refer to Theorem 2). |