Robust Neuro-Symbolic Goal and Plan Recognition

Authors: Leonardo Amado, Ramon Fraga Pereira, Felipe Meneguzzi

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approaches in standard human-designed planning domains as well as domain models automatically learned from real-world data. Empirical experimentation shows that our approaches reliably infer goals and compute correct plans in the experimental datasets. An ablation study shows that we outperform existing approaches that rely exclusively on the domain model, or exclusively on machine learning, in problems with both noisy observations and low observability.
Researcher Affiliation | Academia | Leonardo Amado (1), Ramon Fraga Pereira (2, 3), Felipe Meneguzzi (4, 1); (1) Pontifical Catholic University of Rio Grande do Sul, Brazil; (2) University of Manchester, England, UK; (3) Sapienza University of Rome, Italy; (4) University of Aberdeen, Scotland, UK
Pseudocode | Yes | Algorithm 1: COMPUTESEQUENCE(I, A, Ω, G, λ) and Algorithm 2: PREDICTNEXTS(A, S, G, o) are presented as structured pseudocode blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository.
Open Datasets | Yes | Hand-Crafted Domain Datasets: These datasets consist of two domains from the International Planning Competition (IPC), i.e., Blocks-World (BLOCKS) and LOGISTICS. ... Latent Space Datasets: ... We select two domains from (Asai and Fukunaga 2018): MNIST 8-puzzle and Lights-Out Digital (LODIGITAL). The MNIST 8-puzzle uses handwritten digits from the MNIST dataset as tiles.
Dataset Splits | No | The paper states: 'After computing a plan for each problem, we separate the data into a training set, containing 80 plan instances, and a test set with 20 plan instances.' It also mentions 'validation loss' during training but does not specify a percentage or absolute count for a validation split. (A minimal sketch of the reported 80/20 split appears after this table.)
Hardware Specification | Yes | We ran all experiments with a timeout of 1200 seconds per problem in a single core of a 24-core Intel Xeon E5-2620 CPU @ 2.00GHz with 160GB of RAM and a memory limit of 8GB, and an NVIDIA Titan Xp GPU.
Software Dependencies | No | The paper mentions using an 'Adam optimizer' and 'LSTM', but it does not specify version numbers for programming languages, libraries, or specific software packages (e.g., Python version, TensorFlow/PyTorch versions).
Experiment Setup | Yes | We randomly remove observations from the test plans to obtain levels of observability of 10%, 30%, 50%, and 70%, as well as full observability (100%). ... We introduce noise by iterating through all observations of the dataset and swapping a correct observation for a noisy one with a probability of either 10% or 20%. ... We train the models with the Adam optimizer. During training, our model receives a trace as input and outputs a prediction, which is a reconstruction of the correct state. We interrupt training after 10 consecutive epochs with no improvement in validation loss. ... ϑ is our confidence threshold of the ML model used in PPRσh, measured as the mean value of State Acc. and Rec. Acc. (A sketch of this observation-degradation protocol appears after this table.)
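
The 80/20 split reported under Dataset Splits is simple to reproduce. The following is a minimal sketch, assuming the 100 plan instances per domain are held in a plain Python list; the shuffling, the seed, and the function name are illustrative assumptions rather than details from the paper, and the paper does not say how (or whether) a separate validation set is carved out.

```python
# Minimal sketch of the 80/20 train/test split described under Dataset Splits.
# Only the counts (80 train, 20 test) come from the paper; the shuffling and
# fixed seed are assumptions added for illustration.
import random

def split_plans(plans, n_train=80, n_test=20, seed=0):
    """Shuffle plan instances and return (train, test) lists of the given sizes."""
    assert len(plans) >= n_train + n_test, "not enough plan instances"
    rng = random.Random(seed)
    shuffled = list(plans)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:n_train + n_test]
```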
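
Similarly, the observation-degradation protocol quoted under Experiment Setup can be sketched as two small transformations of an observation trace. The trace encoding (a list of observation tokens) and the pool of candidate noisy observations are assumptions for illustration; only the observability levels and the 10%/20% swap probabilities come from the paper.

```python
# Hedged sketch of the observation-degradation protocol from Experiment Setup.
import random

OBSERVABILITY_LEVELS = [0.1, 0.3, 0.5, 0.7, 1.0]  # 10% to 100%, as reported
NOISE_LEVELS = [0.1, 0.2]                          # 10% and 20% swap probability

def drop_observations(trace, observability, rng):
    """Keep a random fraction of the observations, preserving their order."""
    if not trace:
        return []
    keep = max(1, round(len(trace) * observability))
    kept = sorted(rng.sample(range(len(trace)), keep))
    return [trace[i] for i in kept]

def add_noise(trace, swap_prob, candidate_obs, rng):
    """Swap each observation for a different, incorrect one with probability swap_prob."""
    noisy = []
    for obs in trace:
        if rng.random() < swap_prob:
            noisy.append(rng.choice([o for o in candidate_obs if o != obs]))
        else:
            noisy.append(obs)
    return noisy
```

The early-stopping rule quoted above (halting after 10 consecutive epochs without improvement in validation loss) corresponds to standard patience-based stopping in common deep-learning frameworks.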