Learning Safe Action Models with Partial Observability

Authors: Hai S. Le, Brendan Juba, Roni Stern

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we evaluated the performance of both algorithms and compared them with FAMA (Aineto, Celorrio, and Onaindia 2019) on common domains from the International Planning Competition (IPC) (McDermott 2000). Our results show that PI-SAM and EPI-SAM often outperform FAMA in terms of the number of samples they require to learn effective action models, while still preserving our safety guarantee.
Researcher Affiliation | Academia | Hai S. Le and Brendan Juba (Washington University in St. Louis); Roni Stern (Ben Gurion University of the Negev)
Pseudocode | Yes | Algorithm 1: EPI-SAM: Learning Effects; Algorithm 2: EPI-SAM: Learning Preconditions
Open Source Code | Yes | The source code of the experiments is available at https://github.com/hsle/pisam_learning.
Open Datasets | Yes | We evaluated our algorithms' performance experimentally on the IPC (McDermott 2000) domains listed in Table 1.
Dataset Splits | No | The paper does not explicitly specify train/validation/test dataset splits. It mentions using trajectories for learning and separate trajectories for evaluating empirical precision and recall, but not in terms of standard data splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | Yes | In the resulting trajectories, we masked some states using random masking with masking probability η = 0.1 and η = 0.3. ... We limited the running time of each algorithm to 60 seconds.
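The random-masking setup quoted above (each state in a trajectory is hidden independently with probability η) can be sketched as follows. This is a minimal illustration, not the paper's implementation; the helper name `mask_states` and the use of `None` to denote a masked state are assumptions for the example.

```python
import random

def mask_states(states, eta, rng=None):
    """Independently replace each observed state with None (masked)
    with probability eta, modeling partial observability via random
    masking with masking probability eta (e.g., 0.1 or 0.3).

    NOTE: hypothetical helper for illustration only; the paper's
    actual code may represent masked states differently.
    """
    rng = rng or random.Random()
    return [None if rng.random() < eta else s for s in states]
```

For example, `mask_states(trajectory_states, 0.1)` would hide roughly 10% of the states in a trajectory before it is passed to the learner, matching the η = 0.1 condition.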