Policy Optimization with Linear Temporal Logic Constraints

Authors: Cameron Voloshin, Hoang Le, Swarat Chaudhuri, Yisong Yue

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | From the paper: "Empirically, our algorithm can achieve strong performance even in low-sample regimes." Item 3 of the summarized contributions reads: "We empirically validate using both infinite- and indefinite-horizon problems, and with composite specifications such as collecting items while avoiding enemies. We find that our method enjoys strong performance, often requiring many fewer samples than our worst-case guarantees."
Researcher Affiliation | Collaboration | Cameron Voloshin (Caltech); Hoang M. Le (Argo AI); Swarat Chaudhuri (UT Austin); Yisong Yue (Argo AI, Caltech)
Pseudocode | Yes | Algorithm 1: LTL Constrained Planning (LCP); Algorithm 2: Plan Recurrent (PR); Algorithm 3: Plan Transient (PT). A hedged sketch of the product construction underlying these algorithms appears after the table.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "We perform experiments in two domains: (1) a Pacman domain, where an agent finds food and indefinitely avoids a ghost; (2) a discretized version of mountain car (MC) [14], where the agent must reach the flag." Reference [14]: Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. OpenAI Gym, 2016. A sketch of one plausible discretization of the Gym mountain car appears after the table.
Dataset Splits | No | The paper uses specific domains (Pacman, Mountain Car) for its experiments but does not describe how data was split into training, validation, or test sets, nor does it give percentages or sample counts for such splits.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions OpenAI Gym as the source of the Mountain Car environment but does not list software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or other libraries).
Experiment Setup | No | The paper describes the experimental domains and comparisons against baselines but does not specify concrete setup details such as hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or optimizer settings.
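Since the paper's pseudocode (LCP, PR, PT) is not reproduced on this page, the following is a minimal, self-contained Python sketch of the product construction that LTL-constrained planners of this kind are typically built on: the MDP is synchronized with a Büchi automaton for the specification, and a recurrent plan tries to visit accepting automaton states infinitely often. The toy MDP, the labeling, the "GF food" automaton, and every name below (P, LABEL, delta, product_rollout) are illustrative assumptions, not the authors' code.

import random

# Toy MDP: 3 states, 2 actions; each entry is a list of (probability, next_state).
P = {
    (0, "a"): [(0.8, 1), (0.2, 0)],
    (0, "b"): [(1.0, 0)],
    (1, "a"): [(1.0, 2)],
    (1, "b"): [(1.0, 0)],
    (2, "a"): [(1.0, 2)],
    (2, "b"): [(1.0, 2)],
}
LABEL = {0: set(), 1: set(), 2: {"food"}}  # atomic propositions per state

# Buchi automaton for "GF food" (see food infinitely often): the accepting
# state q_acc is entered exactly when the current label contains "food".
ACCEPTING = {"q_acc"}

def delta(q, label):
    return "q_acc" if "food" in label else "q0"

def sample_next(s, a, rng):
    probs, succs = zip(*P[(s, a)])
    return rng.choices(succs, weights=probs, k=1)[0]

def product_rollout(policy, s0=0, q0="q0", horizon=50, seed=0):
    """Roll out the synchronous product (s, q): step the MDP, then advance
    the automaton on the label of the next MDP state, counting visits to
    accepting automaton states (the quantity a recurrent plan maximizes)."""
    rng = random.Random(seed)
    s, q, accepting_visits = s0, q0, 0
    for _ in range(horizon):
        s = sample_next(s, policy(s, q), rng)
        q = delta(q, LABEL[s])
        accepting_visits += q in ACCEPTING
    return accepting_visits

# A policy that always plays "a" drives the agent to state 2 and keeps it
# there, so the run visits the accepting state at almost every step.
print(product_rollout(lambda s, q: "a"))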
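The paper names the environments but not their preprocessing, so here is a minimal sketch of one plausible way to discretize Gym's MountainCar. The 20x20 binning and the use of the classic gym API (reset() returning only the observation, per the 2016 release the paper cites) are assumptions, not details taken from the paper.

import gym
import numpy as np

env = gym.make("MountainCar-v0")
N_BINS = np.array([20, 20])  # (position, velocity) bins; an assumed choice
low, high = env.observation_space.low, env.observation_space.high

def discretize(obs):
    """Map a continuous (position, velocity) pair to integer bin indices."""
    ratios = (np.asarray(obs) - low) / (high - low)
    idx = (ratios * N_BINS).astype(int)
    return tuple(np.clip(idx, 0, N_BINS - 1))

obs = env.reset()
print(discretize(obs))  # a pair of bin indices near the start state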