Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Approximating Perfect Recall when Model Checking Strategic Abilities: Theory and Applications

Authors: Francesco Belardinelli, Alessio Lomuscio, Vadim Malvone, Emily Yu

JAIR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Finally, we extend MCMAS, an open-source model checker for ATL and other agent specifications, to incorporate bounded recall; we illustrate its use and present experimental results.
Researcher Affiliation Academia Francesco Belardinelli EMAIL Imperial College London, United Kingdom and Laboratoire IBISC, Universit e d Evry, France Alessio Lomuscio EMAIL Imperial College London United Kingdom Vadim Malvone EMAIL T el ecom Paris France Emily Yu EMAIL Imperial College London, United Kingdom and Johannes Kepler University Linz, Austria
Pseudocode Yes Algorithm MC(M, ϕ, n) Algorithm Transl(ϕ, v) Algorithm MC3(M, ϕ, n) Algorithm Iterative MC(M, ψ, n)
Open Source Code Yes We implemented the algorithms in Section 4 in MCMASBR MCMASBR (2021), an experimental model checker that extends the open-source verification tool MCMAS (Lomuscio et al., 2017) by supporting the bounded recall semantics introduced in Section 2... MCMASBR. MCMAS with bounded recall. 2021. http://vas.doc.ic.ac.uk/software.
Open Datasets No The paper uses examples like the 'Shell Game' and 'Simple Voting Scenario' to illustrate the model checking process. These are described as interpreted systems (models) within the paper, rather than referring to external, publicly available datasets. No specific dataset links, DOIs, or repositories for external datasets are provided.
Dataset Splits No The paper does not use external datasets in the traditional sense, but rather constructs 'interpreted systems' for model checking. Therefore, there are no training/test/validation splits relevant to empirical machine learning experiments. The evaluation focuses on how the model checker performs with varying parameters of these constructed systems (e.g., number of shells, waiting steps, bound on recall).
Hardware Specification Yes Table 1 shows the experimental results obtained with MCMASBR running on an Intel Coretm i7-2600 CPU 3.40GHz machine with 16GB RAM running Ubuntu v18.04.2 (Linux kernel v4.15).
Software Dependencies No The paper mentions 'MCMASBR' and 'MCMAS' as the tools used/extended, and the command 'python mcmas_br.py', implying Python is used. It also states the operating system 'Ubuntu v18.04.2 (Linux kernel v4.15)'. However, it does not provide specific version numbers for Python itself or any other key software libraries or dependencies (e.g., specific Python libraries, model checking frameworks besides their own, or compilers).
Experiment Setup Yes As reported in Table 1, we ran experiments with a varying bound on recall between 5 and 25, and 20 to 30 shells. In Table 1 we used a time-out of 60 minutes...