Neural Model Checking

Authors: Mirco Giacobbe, Daniel Kroening, Abhinandan Pal, Michael Tautschnig

NeurIPS 2024

Reproducibility assessment — each entry lists the variable, the result, and the supporting LLM response:
Research Type: Experimental
LLM Response: "We experimentally demonstrate that our method outperforms the state-of-the-art academic and commercial model checkers on a set of standard hardware designs written in SystemVerilog. We have assessed the effectiveness of our method across 194 standard hardware model checking problems written in SystemVerilog and compared our results with the state-of-the-art academic hardware model checkers ABC and nuXmv [24, 27], and two commercial counterparts."
Researcher Affiliation: Collaboration
LLM Response: "Mirco Giacobbe (University of Birmingham, UK), Daniel Kroening (Amazon Web Services, USA), Abhinandan Pal (University of Birmingham, UK), Michael Tautschnig (Amazon Web Services, USA, and Queen Mary University of London, UK)."
Pseudocode: No
LLM Response: "The paper describes methods through text and mathematical formulas but does not include explicit 'Pseudocode' or 'Algorithm' blocks."
Open Source Code: Yes
LLM Response: "We have developed a prototype tool for neural model checking (https://github.com/aiverification/neuralmc), utilising Spot 2.11.6 [44] to generate the automaton A_Φ from an LTL specification Φ. Alongside the paper, we include a zip file in accordance with NeurIPS guidelines. This file contains benchmarks, scripts for running our experiments, and a README.md file that offers detailed instructions on how to reproduce our experiments."
Open Datasets: Yes
LLM Response: "We examine 194 verification tasks derived from ten parameterised hardware designs, detailed in Appendix B. Alongside the paper, we include a zip file in accordance with NeurIPS guidelines. This file contains benchmarks, scripts for running our experiments, and a README.md file that offers detailed instructions on how to reproduce our experiments."
Dataset Splits: No
LLM Response: "The paper describes a synthetic dataset generated from random executions for training, but it does not specify explicit training/validation/test dataset splits with percentages or fixed sample counts."
Hardware Specification: Yes
LLM Response: "Evaluations were conducted on an Intel Xeon 2.5 GHz processor with eight threads and 32 GB of RAM running Ubuntu 20.04."
Software Dependencies: Yes
LLM Response: "We have built a prototype that integrates PyTorch, the bounded model checker EBMC, the LTL-to-automata translator Spot, the SystemVerilog simulator Verilator, and the satisfiability solver Bitwuzla [44, 76, 80, 88]. Spot 2.11.6 [44], Verilator 5.022 [88], PyTorch 2.2.2, EBMC 5.2 [76], Bitwuzla 0.6.0 SMT solver [80], running Ubuntu 20.04."
Experiment Setup: Yes
LLM Response: "Hyper-parameters: We instantiate the architecture described in Section 3 and illustrated in Figure 3, employing two hidden layers containing 8 and 5 neurons. The normalisation layer scales the input values to the range [0, 100]. We train with the AdamW optimiser [70], typically setting the learning rate to 0.1 or selecting from 0.2, 0.05, 0.03, 0.01 if adjusted, with a fixed weight decay of 0.01, demonstrating minimal hyperparameter tuning for training."
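
The reported setup (a normalisation layer scaling inputs to [0, 100], followed by hidden layers of 8 and 5 neurons) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the input width, the ReLU activation, the assumed 8-bit register range, and the scalar output are assumptions not fixed by the quotation above (the paper itself uses PyTorch with AdamW).

```python
import numpy as np

rng = np.random.default_rng(0)

def normalise(x, lo, hi):
    # Normalisation layer: scale each input from its value range [lo, hi]
    # into [0, 100], as described in the experiment setup.
    return 100.0 * (x - lo) / (hi - lo)

def init_layer(n_in, n_out):
    # He-style random initialisation (an assumption; not stated in the paper).
    w = rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)
    return w, np.zeros(n_out)

n_in = 4  # assumed number of state variables read from the hardware design
layers = [init_layer(n_in, 8),  # first hidden layer: 8 neurons
          init_layer(8, 5),     # second hidden layer: 5 neurons
          init_layer(5, 1)]     # scalar output (assumed)

def forward(x):
    h = normalise(x, lo=0.0, hi=255.0)  # assumed 8-bit register range
    for w, b in layers[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layers (assumption)
    w, b = layers[-1]
    return (h @ w + b).item()           # single ranking-style output value

value = forward(np.array([3.0, 17.0, 255.0, 0.0]))
```

In the paper's actual pipeline this network would be trained with AdamW (learning rate 0.1, weight decay 0.01, per the quotation); the sketch only shows the forward structure.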