Efficient Verification of ReLU-Based Neural Networks via Dependency Analysis

Authors: Elena Botoeva, Panagiotis Kouvaros, Jan Kronqvist, Alessio Lomuscio, Ruth Misener3291-3299

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We present Venus, the resulting verification toolkit, and evaluate it on the ACAS collision avoidance networks and models trained on the MNIST and CIFAR-10 datasets. The experimental results obtained indicate considerable gains over the present state-of-the-art tools.
Researcher Affiliation Academia Elena Botoeva, Panagiotis Kouvaros, Jan Kronqvist, Alessio Lomuscio, Ruth Misener Department of Computing, Imperial College London, UK {e.botoeva, p.kouvaros, j.kronqvist, a.lomuscio, r.misener}@imperial.ac.uk
Pseudocode Yes Algorithm 1 The dependency analysis procedure. Algorithm 2 The verification procedure. Algorithm 3 The splitting procedure.
Open Source Code Yes In this section we introduce Venus (Venus 2019), a verification toolkit that implements the dependency analysis procedure and augments it with symbolic interval arithmetic and domain splitting techniques. [Venus 2019] refers to https://vas.doc.ic.ac.uk/software/neural
Open Datasets Yes We used the most commonly used benchmarks in the context of FFNNs verification: ACAS Xu (Julian et al. 2016)... MNIST (Le Cun, Cortes, and Burges 1998)... CIFAR-10 (Krizhevsky, Nair, and Hinton 2014)...
Dataset Splits No The paper mentions models trained on MNIST and CIFAR-10 datasets, and verification on '100 randomly selected images', but does not provide specific train/validation/test dataset splits or their percentages/counts needed to reproduce the data partitioning for model training.
Hardware Specification Yes All experiments were carried out on an Intel Core i7-7700K (4 cores) equipped with 16GB RAM, running Ubuntu 18.04.
Software Dependencies Yes Venus is implemented in Python 3.7 and relies on Gurobi 8.1 for the MILP backend.
Experiment Setup Yes For the experiments, Venus was run with 2 splitters, 4 workers, the stability-ratio cutoff set to 0.7, the depth discount set to 20 and with the dependency analyser turned off... For the experiments, we ran Venus with 2 splitters, 2 workers, the stability-ratio set to 0.4, the depth discount set to 4, and the dependency analyser turned on... Each verification query had a local timeout of 1 hour.