LTL2Action: Generalizing LTL Instructions for Multi-Task RL
Authors: Pashootan Vaezipoor, Andrew C. Li, Rodrigo A. Toro Icarte, Sheila A. McIlraith
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on discrete and continuous domains target combinatorial task sets of up to 10^39 unique tasks and demonstrate the strength of our approach in learning to solve (unseen) tasks, given LTL instructions. |
| Researcher Affiliation | Academia | Department of Computer Science, University of Toronto; Vector Institute for Artificial Intelligence; Schwartz Reisman Institute for Technology and Society. |
| Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Our code and videos of our agents are available at github.com/LTL2Action/LTL2Action. |
| Open Datasets | No | The paper mentions using 'MiniGrid (Chevalier-Boisvert et al., 2018)' and 'OpenAI's Safety Gym (Ray et al., 2019)' as environments and describes 'LetterWorld' as similar to 'Andreas et al. 2017'. While these are well-known RL environments, the paper does not provide explicit access information (link, DOI, or a specific dataset citation) for a publicly available dataset in the conventional sense of a fixed data split. |
| Dataset Splits | No | The paper describes how tasks are sampled ('sampled i.i.d. from a large set of possible tasks Φ') and discusses evaluation and generalization, but it does not specify explicit train/validation/test splits with the percentages or sample counts needed for reproduction. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper mentions using 'Spot (Duret-Lutz et al., 2016)' for LTL simplification and 'Proximal Policy Optimization (PPO) (Schulman et al., 2017)' for the RL algorithm, but it does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Details about neural network architectures and PPO hyperparameters can be found in Appendix Sections B.2 and B.3, respectively. |