Online Probabilistic Goal Recognition over Nominal Models
Authors: Ramon Fraga Pereira, Mor Vered, Felipe Meneguzzi, Miquel Ramírez
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed recognition algorithms empirically in Section 5, using three benchmark domains based on the constrained Linear Quadratic Regulator (LQR) problem [Bemporad et al., 2002], with increasing dimensions of state and action spaces. We build synthetic datasets for these domains and show that the first of the proposed algorithms performs quite well and infers the correct hidden goals when run on the actual transition function of the domain, gracefully degrading over nominal models due to the sometimes poor generalisation ability of the neural networks obtained. |
| Researcher Affiliation | Academia | 1. Pontifical Catholic University of Rio Grande do Sul, Brazil; 2. Monash University, Australia; 3. The University of Melbourne, Australia. ramon.pereira@edu.pucrs.br, mor.vered@monash.edu, felipe.meneguzzi@pucrs.br, miquel.ramirez@unimelb.edu.au |
| Pseudocode | No | The paper describes methods and algorithms in text but does not include any labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Results and Jupyter notebooks are available in https://github.com/authors-ijcai19-3244/ijcai19-paper3244-results |
| Open Datasets | No | To build the datasets and learn the system dynamics for the domains discussed previously, we generated 500 different navigation tasks, i.e. pairs of states x_0 and x_G. We set the horizon H = 100, resulting in three different datasets with 50,000 transitions each. To generate the trajectories for each of the tasks, we first encoded the FHOCs for each of the domains using the RDDL domain description language [Sanner, 2011]. |
| Dataset Splits | No | Training was stopped for all domains after 300 epochs. We used exactly the same DNN configuration to learn the system dynamics for all of our domains. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | For recognising goals over actual models we used the implementation of the TF-PLAN planner used in [Bueno et al., 2019] that takes as input a domain model formalised in RDDL. For nominal models we used the implementation of TF-PLAN in [Wu et al., 2017] that takes as input a domain model represented as a DNN. |
| Experiment Setup | Yes | For the training stage, we configured the DNN proposed by [Say et al., 2017] to use the same hyper-parameters: 1 hidden layer, a batch size of 128 transitions, a learning rate of 0.01, and a dropout rate of 0.1. Training was stopped for all domains after 300 epochs. We used exactly the same DNN configuration to learn the system dynamics for all of our domains. For both planners we set the learning rate to 0.01, the batch size to 128, and the number of epochs to 300. |
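The dataset-size arithmetic and DNN hyper-parameters quoted above can be collected in a short sketch. This is illustrative only: the variable names are ours, not the authors', and the configuration is simply the set of values reported in the table (500 tasks at horizon H = 100 yielding 50,000 transitions per domain; 1 hidden layer, batch size 128, learning rate 0.01, dropout 0.1, 300 epochs).

```python
# Hedged sketch of the reported experiment setup; names are illustrative,
# not taken from the authors' code.

NUM_TASKS = 500   # navigation tasks, i.e. (x_0, x_G) state pairs per domain
HORIZON = 100     # planning horizon H

# Each task contributes H transitions, giving the dataset size per domain.
transitions_per_domain = NUM_TASKS * HORIZON  # 500 * 100 = 50,000

# DNN configuration reported for learning the system dynamics,
# shared across all three LQR benchmark domains.
dnn_config = {
    "hidden_layers": 1,
    "batch_size": 128,      # transitions per batch
    "learning_rate": 0.01,
    "dropout_rate": 0.1,
    "epochs": 300,          # training stopped here for all domains
}

print(transitions_per_domain)  # → 50000
```

The same learning rate, batch size, and epoch count were reportedly reused for both planner implementations (TF-PLAN over RDDL and over learned DNNs).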