Certified Policy Verification and Synthesis for MDPs under Distributional Reach-Avoidance Properties

Authors: S. Akshay, Krishnendu Chatterjee, Tobias Meggendorfer, Đorđe Žikelić

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our experimental evaluation demonstrates the ability of our method to solve several non-trivial examples, including a multi-agent robot-swarm model, to synthesize certified policies and to certify existing policies.
Researcher Affiliation Academia 1Indian Institute of Technology Bombay, India 2Institute of Science and Technology Austria (ISTA), Austria 3Lancaster University Leipzig, Germany 4Singapore Management University, Singapore
Pseudocode No The paper describes algorithms (e.g., "Algorithm outline. The algorithm employs a template-based synthesis approach and proceeds in three steps.") but does not provide formal pseudocode blocks or figures.
Open Source Code Yes Our implementation is publicly available at https://zenodo.org/records/11082466.
Open Datasets No The paper describes models/benchmarks used (e.g., "robot swarms in gridworld environments", "Insulin, a pharmacokinetics system", "Page Rank"), often referencing external papers, but it does not state that these are publicly available datasets or provide direct links to them as datasets for experimental use. The focus is on the models and formal methods, not on data-driven training.
Dataset Splits No The paper does not describe specific training, validation, or test dataset splits. The work is in formal verification/synthesis, not typically involving data splits in the machine learning sense.
Hardware Specification Yes Our experiments were executed on consumer hardware (AMD Ryzen 3600 CPU with 16 GB RAM).
Software Dependencies Yes We implemented a prototype of our method in Python 3, using Sym Py [Meurer et al., 2017] for symbolic expressions and Py SMT [Gario and Micheli, 2015] to manage SMT solvers. We employ Yices 2.6 [Dutertre, 2014] as solving backend.
Experiment Setup Yes For all examples, we find the template size NI = 1 to be sufficient. Input. The algorithm takes as input an MDP M = (S, Act, δ) together with affine sets of initial distributions Init, target distributions T and safe distributions H. It also takes as input the template size parameter NI.