Online Probabilistic Goal Recognition over Nominal Models

Authors: Ramon Fraga Pereira, Mor Vered, Felipe Meneguzzi, Miquel Ramírez

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate the proposed recognition algorithms empirically in Section 5, using three benchmark domains based on the constrained Linear Quadratic Regulator (LQR) problem [Bemporad et al., 2002], with increasing dimensions of state and action spaces. We build synthetic datasets for these domains and show that the first of the proposed algorithms performs quite well and infers the correct hidden goals when run on the actual transition function of the domain, gracefully degrading over nominal models, due to the sometimes poor generalisation ability of the neural networks obtained.
Researcher Affiliation Academia 1 Pontifical Catholic University of Rio Grande do Sul, Brazil; 2 Monash University, Australia; 3 The University of Melbourne, Australia. Emails: ramon.pereira@edu.pucrs.br, mor.vered@monash.edu, felipe.meneguzzi@pucrs.br, miquel.ramirez@unimelb.edu.au
Pseudocode No The paper describes methods and algorithms in text but does not include any labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code Yes Results and Jupyter notebooks are available in https://github.com/authors-ijcai19-3244/ijcai19-paper3244-results
Open Datasets No To build the datasets and learn the system dynamics for the domains discussed previously, we generated 500 different navigation tasks, i.e., pairs of states x0 and xG. We set the horizon H = 100, resulting in three different datasets with 50,000 transitions each. To generate the trajectories for each of the tasks, we first encoded the FHOCs for each of the domains using the RDDL domain description language [Sanner, 2011].
Dataset Splits No Training was stopped for all domains after 300 epochs. We used exactly the same DNN configuration to learn the system dynamics for all of our domains.
Hardware Specification No The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments.
Software Dependencies No For recognising goals over actual models we used the implementation of the TF-PLAN planner used in [Bueno et al., 2019] that takes as input a domain model formalised in RDDL. For nominal models we used the implementation of TF-PLAN in [Wu et al., 2017] that takes as input a domain model represented as a DNN.
Experiment Setup Yes For the training stage, we configured the DNN proposed by [Say et al., 2017] to use the same hyper-parameters: 1 hidden layer, a batch size of 128 transitions, a learning rate of 0.01, and a dropout rate of 0.1. Training was stopped for all domains after 300 epochs. We used exactly the same DNN configuration to learn the system dynamics for all of our domains. For both planners we set the learning rate to 0.01, the batch size to 128, and the number of epochs to 300.
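The rows above quote the paper's data-generation and training recipe: trajectories rolled out for navigation tasks at horizon H = 100, and a one-hidden-layer DNN learning the system dynamics with batch size 128, learning rate 0.01, dropout 0.1, and 300 epochs. The sketch below illustrates that pipeline in miniature. It is NOT the authors' code: the 2-D double-integrator dynamics, the hidden-layer width, the input standardisation, and the reduced task count (50 instead of 500, so the example runs quickly) are all illustrative assumptions; only the hyper-parameters come from the quoted setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Dataset generation (assumed LQR-like dynamics; not from the paper) ---
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # assumed state-transition matrix
B = np.array([[0.0], [0.1]])             # assumed control matrix

def rollout(x0, horizon=100):            # H = 100, as in the quoted setup
    """Roll out random controls from x0; return (x, u, x') transitions."""
    transitions, x = [], x0
    for _ in range(horizon):
        u = rng.uniform(-1.0, 1.0, size=(1,))
        x_next = A @ x + B @ u
        transitions.append((x, u, x_next))
        x = x_next
    return transitions

data = []
for _ in range(50):                      # the paper uses 500 tasks
    data.extend(rollout(rng.uniform(-5.0, 5.0, size=(2,))))

X = np.array([np.concatenate([s, a]) for s, a, _ in data])  # inputs (x, u)
Y = np.array([s2 for _, _, s2 in data])                     # targets x'
Xs = (X - X.mean(0)) / X.std(0)          # standardise (an assumption)
Ys = (Y - Y.mean(0)) / Y.std(0)

# --- One-hidden-layer MLP with the quoted hyper-parameters ---
hidden = 32                              # hidden width: an assumption
lr, batch, epochs, dropout = 0.01, 128, 300, 0.1

W1 = rng.normal(0, 0.1, (Xs.shape[1], hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.1, (hidden, Ys.shape[1])); b2 = np.zeros(Ys.shape[1])

def mse(pred, y):
    return float(((pred - y) ** 2).mean())

for _ in range(epochs):
    order = rng.permutation(len(Xs))
    for start in range(0, len(Xs), batch):
        i = order[start:start + batch]
        x, y = Xs[i], Ys[i]
        h = np.maximum(0.0, x @ W1 + b1)                        # ReLU layer
        mask = (rng.random(h.shape) > dropout) / (1 - dropout)  # inverted dropout
        hd = h * mask
        pred = hd @ W2 + b2
        # Backpropagation of the mean-squared-error loss
        g = 2.0 * (pred - y) / len(i)
        gW2, gb2 = hd.T @ g, g.sum(0)
        gh = (g @ W2.T) * mask * (h > 0)
        gW1, gb1 = x.T @ gh, gh.sum(0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2

pred = np.maximum(0.0, Xs @ W1 + b1) @ W2 + b2   # dropout disabled at test time
print("final training MSE (standardised targets):", mse(pred, Ys))
```

Because the assumed dynamics are linear, the one-hidden-layer network fits them closely; the paper's point is that such nominal models can generalise poorly on harder, constrained domains, which a sketch this small cannot reproduce.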