Neural Model Checking
Authors: Mirco Giacobbe, Daniel Kroening, Abhinandan Pal, Michael Tautschnig
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally demonstrate that our method outperforms the state-of-the-art academic and commercial model checkers on a set of standard hardware designs written in SystemVerilog. We have assessed the effectiveness of our method across 194 standard hardware model checking problems written in SystemVerilog and compared our results with the state-of-the-art academic hardware model checkers ABC and nuXmv [24, 27], and two commercial counterparts. |
| Researcher Affiliation | Collaboration | Mirco Giacobbe, University of Birmingham, UK; Daniel Kroening, Amazon Web Services, USA; Abhinandan Pal, University of Birmingham, UK; Michael Tautschnig, Amazon Web Services, USA and Queen Mary University of London, UK |
| Pseudocode | No | The paper describes methods through text and mathematical formulas but does not include explicit 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | We have developed a prototype tool for neural model checking2, utilising Spot 2.11.6 [44] to generate the automaton A Φ from an LTL specification Φ. 2https://github.com/aiverification/neuralmc Alongside the paper, we include a zip file in accordance with NeurIPS guidelines. This file contains benchmarks, scripts for running our experiments, and a README.md file that offers detailed instructions on how to reproduce our experiments. |
| Open Datasets | Yes | We examine 194 verification tasks derived from ten parameterised hardware designs, detailed in Appendix B. Alongside the paper, we include a zip file in accordance with NeurIPS guidelines. This file contains benchmarks, scripts for running our experiments, and a README.md file that offers detailed instructions on how to reproduce our experiments. |
| Dataset Splits | No | The paper describes a synthetic dataset generated from random executions for training, but it does not specify explicit training/validation/test dataset splits with percentages or fixed sample counts. |
| Hardware Specification | Yes | Evaluations were conducted on an Intel Xeon 2.5 GHz processor with eight threads and 32 GB of RAM running Ubuntu 20.04. |
| Software Dependencies | Yes | We have built a prototype that integrates PyTorch, the bounded model checker EBMC, the LTL-to-automata translator Spot, the SystemVerilog simulator Verilator, and the satisfiability solver Bitwuzla [44, 76, 80, 88]. Spot 2.11.6 [44], Verilator version 5.022 [88], PyTorch 2.2.2, EBMC 5.2 [76], Bitwuzla 0.6.0 SMT solver [80], running Ubuntu 20.04. |
| Experiment Setup | Yes | Hyper-parameters: We instantiate the architecture described in Section 3 and illustrated in Figure 3, employing two hidden layers containing 8 and 5 neurons. The normalisation layer scales the input values to the range [0, 100]. We train with the AdamW optimiser [70], typically setting the learning rate to 0.1 or selecting from 0.2, 0.05, 0.03, 0.01 if adjusted, with a fixed weight decay of 0.01, demonstrating minimal hyperparameter tuning for training. |
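The experiment setup above can be sketched in PyTorch. This is a minimal illustration, not the authors' implementation: the input width, activation function, and per-input ranges are assumptions; only the hidden-layer sizes (8 and 5), the [0, 100] input scaling, and the AdamW settings (learning rate 0.1, weight decay 0.01) come from the paper.

```python
import torch
import torch.nn as nn

class NeuralCertificate(nn.Module):
    """Hypothetical sketch of the reported architecture: a normalisation
    layer scaling each input to [0, 100], then hidden layers of 8 and 5
    neurons. Input width and ReLU activations are assumptions."""

    def __init__(self, n_inputs: int, input_ranges: torch.Tensor):
        super().__init__()
        # Per-input scale factors mapping hardware value ranges to [0, 100].
        self.register_buffer("scale", 100.0 / input_ranges)
        self.net = nn.Sequential(
            nn.Linear(n_inputs, 8),  # first hidden layer: 8 neurons
            nn.ReLU(),
            nn.Linear(8, 5),         # second hidden layer: 5 neurons
            nn.ReLU(),
            nn.Linear(5, 1),         # scalar output
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x * self.scale)

# Example instantiation with assumed input ranges (e.g. 8-bit registers).
model = NeuralCertificate(n_inputs=4,
                          input_ranges=torch.tensor([255.0, 255.0, 15.0, 1.0]))

# AdamW with the hyper-parameters reported in the table.
optimizer = torch.optim.AdamW(model.parameters(), lr=0.1, weight_decay=0.01)
```

The table notes that the learning rate is occasionally chosen from {0.2, 0.05, 0.03, 0.01} instead of 0.1, so a reproduction would sweep only that single hyper-parameter while keeping the weight decay fixed at 0.01.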