Policy Optimization with Linear Temporal Logic Constraints
Authors: Cameron Voloshin, Hoang Le, Swarat Chaudhuri, Yisong Yue
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our algorithm can achieve strong performance even in low-sample regimes. ... In summary, the contributions of this paper are: ... 3. We empirically validate using both infinite- and indefinite-horizon problems, and with composite specifications such as collecting items while avoiding enemies. We find that our method enjoys strong performance, often requiring many fewer samples than our worst-case guarantees. |
| Researcher Affiliation | Collaboration | Cameron Voloshin (Caltech); Hoang M. Le (Argo AI); Swarat Chaudhuri (UT Austin); Yisong Yue (Argo AI, Caltech) |
| Pseudocode | Yes | Algorithm 1: LTL Constrained Planning (LCP); Algorithm 2: Plan Recurrent (PR); Algorithm 3: Plan Transient (PT) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We perform experiments in two domains: (1) a Pacman domain where an agent finds food and indefinitely avoids a ghost; (2) a discretized version of mountain car (MC) [14] where the agent must reach the flag. [14] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. OpenAI Gym, 2016. |
| Dataset Splits | No | The paper mentions the use of specific domains (Pacman, Mountain Car) for experiments but does not provide details on how the data was split into training, validation, or test sets, nor does it specify percentages or sample counts for these splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions the use of 'OpenAI Gym' as a source for the Mountain Car environment but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries). |
| Experiment Setup | No | The paper discusses the experimental domains and compares results with baselines, but it does not specify concrete experimental setup details such as hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or specific optimizer settings. |
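The Open Datasets row above refers to a "discretized version of mountain car" built on OpenAI Gym. As an illustration only (the paper does not publish its discretization, so the bounds and bin counts below are assumptions: the bounds follow the Gym MountainCar-v0 documentation, and the grid size is arbitrary), such a discretization might look like:

```python
import numpy as np

# Observation bounds per the Gym MountainCar-v0 docs:
# position in [-1.2, 0.6], velocity in [-0.07, 0.07].
POS_BOUNDS = (-1.2, 0.6)
VEL_BOUNDS = (-0.07, 0.07)
N_BINS = 20  # illustrative grid resolution, not from the paper


def discretize(position, velocity, n_bins=N_BINS):
    """Map a continuous (position, velocity) pair to a discrete state id."""
    pos_frac = (position - POS_BOUNDS[0]) / (POS_BOUNDS[1] - POS_BOUNDS[0])
    vel_frac = (velocity - VEL_BOUNDS[0]) / (VEL_BOUNDS[1] - VEL_BOUNDS[0])
    pos_bin = int(np.clip(pos_frac * n_bins, 0, n_bins - 1))
    vel_bin = int(np.clip(vel_frac * n_bins, 0, n_bins - 1))
    return pos_bin * n_bins + vel_bin


# Example: a state near the valley floor with zero velocity.
state = discretize(-0.5, 0.0)
```

The resulting finite state space (here 20 x 20 = 400 states) is what makes tabular planning over a product with an LTL automaton tractable.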