Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Neuron Similarity-Based Neural Network Verification via Abstraction and Refinement
Authors: Yuehao Liu, Yansong Dong, Liang Zhao, Wensheng Wang, Cong Tian
IJCAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that ARVerifier significantly reduces network size and yields verification time reductions by 11.61%, 18.70%, and 12.20% compared to α, β-CROWN, Verinet, and Marabou, respectively. |
| Researcher Affiliation | Collaboration | 1School of Computer Science and Technology, Xidian University 2Beijing Sunwise Information Technology Ltd EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 specifies the general neural network verification method based on CEGAR. Given a verification problem N, P, Q , the algorithm returns SAFE if the safety property is satisfied, or UNSAFE along with a counterexample cex. |
| Open Source Code | No | The paper states, "We have implemented this method as a tool named ARVerifier and integrated it with three state-of-the-art verification tools for evaluation on ACAS Xu and MNIST benchmarks." While it mentions implementation as a tool, there is no explicit statement about making the source code publicly available or providing a link to a repository. |
| Open Datasets | Yes | The neural networks considered in the experiments are from the fully connected benchmarks used in the international verification of neural networks competition (VNN-COMP) [Brix et al., 2024; Brix et al., 2023], specifically the ACAS Xu and MNIST datasets. |
| Dataset Splits | No | The paper describes how properties were tested on the ACAS Xu and MNIST datasets (e.g., "testing 4 properties across all 45 networks and 6 on a single network" for ACAS Xu, and "testing 25 images with l perturbations of 0.02 and 0.05" for MNIST). However, it does not specify any training/test/validation splits for the datasets themselves, as the paper focuses on verification of pre-existing DNNs, not their training. |
| Hardware Specification | Yes | Experiments are performed on a 64-bit Ubuntu 18.04 platform equipped with 64 GB of RAM and an Intel i77700 quad-core processor. ... α, β-CROWN is executed with an additional NVIDIA TITAN RTX GPU with 24 GB memory. |
| Software Dependencies | Yes | All verification tools are implemented in Python 3.8. For the tools involving MILP solving, α, β-CROWN and Marabou use Gurobi 9.1, while Verinet uses Xpress 9.0 as the backend solver. |
| Experiment Setup | Yes | Besides, we impose a 30-minute timeout for each verification problem, while maintaining default values for other parameters. |