Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Formal Explanations of Neural Network Policies for Planning

Authors: Renee Selvey, Alban Grastien, Sylvie Thiébaux

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present experimental results of our implementation of this approach for ASNet policies for classical planning domains.
Researcher Affiliation | Academia | (1) School of Computing, The Australian National University; (2) LAAS-CNRS, ANITI, Université de Toulouse
Pseudocode | Yes | Algorithm 1 Computing a minimal explanation for a sequence of decisions.
Open Source Code | Yes | For reproducibility, our repository https://github.com/ReneeSelvey/policy-explanations provides our algorithm implementation, benchmarks used, learnt policies, and the scripts to learn them and run the experiments.
Open Datasets | Yes | We took all deterministic domains and training instances from the code distributions of [Toyer et al., 2020] and [Steinmetz et al., 2022].
Dataset Splits | No | The paper mentions generating problems for evaluation but does not specify a validation split or its details for the experimental data used in this paper.
Hardware Specification | Yes | All experiments were run on a machine with an AMD Ryzen Threadripper 3990X CPU, with 64 cores/128 threads, a clock speed of 2.9 GHz base, 4.3 GHz max boost, and 128 GB of memory of which we used 64 GB.
Software Dependencies | Yes | Gurobi version 9.1.2 is the MIP solver used for the experiments.
Experiment Setup | Yes | To ensure the model is accurate enough for our experiments, we set the integer feasibility tolerance (IntFeasTol) to 10^-9 and the error for function approximations (FuncPieceError) to 10^-6. ... Each explanation problem was run with a time limit of 3h, except for Gripper for which the timeout was 4h.
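
The solver settings quoted in the Experiment Setup row could be collected in one place roughly as follows. This is a minimal sketch, not the authors' code: the names GUROBI_PARAMS, time_limit_seconds, and configure are hypothetical, and mapping the per-problem time limit onto Gurobi's TimeLimit parameter is an assumption, since one explanation problem may involve more than one solver call.

```python
# Illustrative sketch only; helper and variable names are hypothetical.

# Tolerances quoted in the Experiment Setup row.
# Note: FuncPieceError is only consulted when FuncPieces is -1 or -2;
# the source does not say which was used, so FuncPieces is left unset here.
GUROBI_PARAMS = {
    "IntFeasTol": 1e-9,      # integer feasibility tolerance
    "FuncPieceError": 1e-6,  # error bound for function approximations
}

DEFAULT_TIMEOUT_HOURS = 3
TIMEOUT_HOURS_BY_DOMAIN = {"gripper": 4}  # the one exception named in the source


def time_limit_seconds(domain):
    """Return the time limit for a domain, converted from hours to seconds."""
    hours = TIMEOUT_HOURS_BY_DOMAIN.get(domain.lower(), DEFAULT_TIMEOUT_HOURS)
    return hours * 3600


def configure(model, domain):
    """Apply the settings to any object exposing setParam(name, value).

    Assumption: the time limit is applied per model via TimeLimit.
    """
    for name, value in GUROBI_PARAMS.items():
        model.setParam(name, value)
    model.setParam("TimeLimit", time_limit_seconds(domain))
    return model
```

With gurobipy installed, a call such as configure(gurobipy.Model(), "gripper") would apply these values to a real model; the helper itself only relies on the setParam method.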