Learning Generalized Policy Automata for Relational Stochastic Shortest Path Problems

Authors: Rushang Karia, Rashmeet Kaur Nayyar, Siddharth Srivastava

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical analysis shows that this approach effectively learns broadly applicable policy knowledge in a few-shot fashion and significantly outperforms state-of-the-art SSP solvers on test problems whose object counts are far greater than those used during training.
Researcher Affiliation | Academia | Rushang Karia, Rashmeet Kaur Nayyar, Siddharth Srivastava; School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ 85281, U.S.A. {Rushang.Karia,rmnayyar,siddharths}@asu.edu
Pseudocode | Yes | Algorithm 1: GPA acceleration for SSPs (an illustrative sketch of the underlying idea follows the table)
Open Source Code | Yes | Our source code is available at https://github.com/AAIR-lab/GRAPL
Open Datasets | Yes | We utilized problem generators from the IPC and IPPC suites and those in Shah et al. (2020) for generating the training and test problems for all domains.
Dataset Splits | No | The paper describes using a small set of training instances and a separate test set, but it does not mention a distinct validation set or specific train/validation/test splits with percentages or sample counts.
Hardware Specification | Yes | We ran our experiments on a cluster of Intel Xeon E5-2680 v4 CPUs running at 2.4 GHz with 16 GiB of RAM.
Software Dependencies | No | The paper states, "Our implementation is a Python adaptation of mdp-lib," but does not specify version numbers for Python, mdp-lib, or any other software libraries or solvers used.
Experiment Setup | Yes | We fixed the time and memory limit for each problem to 7200 seconds and 16 GiB respectively. ... Additional information about our empirical setup, such as problem parameters and hyperparameters used for configuring baselines, is included in Appendix B. (A sketch of one common way to enforce such limits appears below.)
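The Pseudocode row above points to the paper's Algorithm 1, "GPA acceleration for SSPs." The listing below is not that algorithm; it is a minimal, hedged Python sketch of the general idea the title suggests: a learned finite-state automaton over abstract observations restricts which actions an SSP solver backs up at each step. Every identifier here (Gpa, rtdp_trial, abstract, transition, cost) is a hypothetical stand-in invented for illustration, not the authors' API.

```python
# Illustrative sketch only -- NOT the paper's Algorithm 1. Assumes actions are
# represented as sets so that automaton edges can restrict them by intersection.
import random
from dataclasses import dataclass, field


@dataclass
class Gpa:
    """Finite-state automaton over abstract observations of SSP states.

    transitions[q][sigma] = (next_q, allowed_actions). When no edge matches,
    the solver falls back to the full action set, so the automaton can only
    restrict the search, never introduce new behavior.
    """
    start: int
    transitions: dict = field(default_factory=dict)

    def step(self, q, sigma, all_actions):
        if sigma in self.transitions.get(q, {}):
            next_q, allowed = self.transitions[q][sigma]
            return next_q, allowed & all_actions
        return q, all_actions


def rtdp_trial(s0, goal, actions, transition, cost, abstract, gpa, value):
    """One RTDP-style trial whose Bellman backups range only over the
    GPA-pruned action set (assumed non-empty in this sketch).

    transition(s, a) -> list of (prob, successor) pairs; cost(s, a) -> float;
    abstract(s) -> hashable observation fed to the automaton; `value` maps
    states to cost-to-go estimates and is updated in place.
    """
    s, q = s0, gpa.start
    while s not in goal:
        q, usable = gpa.step(q, abstract(s), actions)
        # Greedy backup restricted to the pruned action set.
        qvals = {
            a: cost(s, a) + sum(p * value.get(s2, 0.0)
                                for p, s2 in transition(s, a))
            for a in usable
        }
        best = min(qvals, key=qvals.get)
        value[s] = qvals[best]
        # Sample a successor of the greedy action to continue the trial.
        probs, succs = zip(*transition(s, best))
        s = random.choices(succs, weights=probs, k=1)[0]
    return value
```

Because unmatched observations fall back to the full action set, pruning of this kind preserves reachability of the goal while shrinking the effective branching factor on states the automaton recognizes.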
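The Experiment Setup row fixes a 7200-second / 16 GiB budget per problem but does not say how the limits were enforced. Purely as an assumption, one common way to impose such caps on a POSIX system is the standard-library `resource` module; `run_solver_on_problem` below is a hypothetical entry point.

```python
# A sketch of one way to enforce per-problem resource limits on Linux.
# This is an assumption about tooling, not the authors' harness.
import resource

TIME_LIMIT_S = 7200               # per-problem CPU budget from the paper
MEM_LIMIT_BYTES = 16 * 1024 ** 3  # 16 GiB


def apply_limits():
    # Hard-cap CPU seconds; exceeding it delivers SIGXCPU to the process.
    resource.setrlimit(resource.RLIMIT_CPU, (TIME_LIMIT_S, TIME_LIMIT_S))
    # Cap the address space so allocations past 16 GiB raise MemoryError.
    resource.setrlimit(resource.RLIMIT_AS, (MEM_LIMIT_BYTES, MEM_LIMIT_BYTES))


if __name__ == "__main__":
    apply_limits()
    # run_solver_on_problem(...)  # hypothetical solver entry point
```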