LTL2Action: Generalizing LTL Instructions for Multi-Task RL

Authors: Pashootan Vaezipoor, Andrew C. Li, Rodrigo A. Toro Icarte, Sheila A. McIlraith

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on discrete and continuous domains target combinatorial task sets of up to 10^39 unique tasks and demonstrate the strength of our approach in learning to solve (unseen) tasks, given LTL instructions.
Researcher Affiliation | Academia | 1 Department of Computer Science, University of Toronto; 2 Vector Institute for Artificial Intelligence; 3 Schwartz Reisman Institute for Technology and Society.
Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Our code and videos of our agents are available at github.com/LTL2Action/LTL2Action.
Open Datasets | No | The paper uses 'MiniGrid (Chevalier-Boisvert et al., 2018)' and 'OpenAI's Safety Gym (Ray et al., 2019)' as environments and describes 'Letter World' as similar to Andreas et al. (2017). These are well-known RL environments, but the paper provides no explicit access information (link, DOI, or a dataset-specific citation) for a publicly available dataset in the conventional sense of a fixed data split.
Dataset Splits | No | The paper describes how tasks are sampled ('sampled i.i.d. from a large set of possible tasks Φ') and discusses evaluation and generalization, but it does not specify explicit train/validation/test splits with percentages or sample counts for reproduction.
Hardware Specification | No | The paper does not describe the hardware used to run its experiments, such as specific GPU or CPU models.
Software Dependencies | No | The paper mentions 'Spot (Duret-Lutz et al., 2016)' for LTL simplification and 'Proximal Policy Optimization (PPO) (Schulman et al., 2017)' as the RL algorithm, but it does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | Details about neural network architectures and PPO hyperparameters can be found in Appendix Sections B.2 and B.3, respectively.