Deterministic Policies for Constrained Reinforcement Learning in Polynomial Time
Authors: Jeremy McMahan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our paper is purely theoretical and does not include any experiments. We present a novel algorithm that efficiently computes near-optimal deterministic policies for constrained reinforcement learning (CRL) problems. Our approach combines three key ideas: (1) value-demand augmentation, (2) action-space approximate dynamic programming, and (3) time-space rounding. Our algorithm constitutes a fully polynomial-time approximation scheme (FPTAS) for any time-space recursive (TSR) cost criteria. (An illustrative sketch of these ideas appears after this table.) |
| Researcher Affiliation | Academia | Jeremy McMahan, University of Wisconsin-Madison, jmcmahan@wisc.edu |
| Pseudocode | Yes | Algorithm 1: Reduction to RL; Algorithm 2: Augmented Interaction; Algorithm 3: Approx Bellman Update; Algorithm 4: Approx Solve; Algorithm 5: Approximation Scheme; Algorithm 6: Approx Solve. |
| Open Source Code | No | Our paper is purely theoretical and does not include any experiments. (NeurIPS Paper Checklist, Q5) |
| Open Datasets | No | Our paper is purely theoretical and does not include any experiments. (NeurIPS Paper Checklist, Q4) |
| Dataset Splits | No | Our paper is purely theoretical and does not include any experiments. (NeurIPS Paper Checklist, Q4) |
| Hardware Specification | No | Our paper is purely theoretical and does not include any experiments. (NeurIPS Paper Checklist, Q4) |
| Software Dependencies | No | Our paper is purely theoretical and does not include any experiments. (NeurIPS Paper Checklist, Q4) |
| Experiment Setup | No | Our paper is purely theoretical and does not include any experiments. (NeurIPS Paper Checklist, Q4) |
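
The sketch below is not the paper's Algorithms 1-6; it is a minimal toy illustration of the value-demand augmentation and rounding ideas in the simplest special case we can state concretely: a finite-horizon MDP with deterministic per-step costs and a budget on realized cumulative cost. All names (`augmented_dp`, `P`, `R`, `C`, `grid`) are hypothetical; the paper's actual FPTAS handles general TSR cost criteria and additionally uses action-space approximate dynamic programming.

```python
import numpy as np

def augmented_dp(P, R, C, H, budget, grid=0.1):
    """Toy DP over (state, remaining cost demand) yielding a deterministic policy.

    Illustrative sketch only (hypothetical interface, not the paper's algorithm):
      P[h][s][a] : np.array of next-state probabilities (length = num states)
      R[h][s][a] : scalar reward;  C[h][s][a] : scalar cost >= 0 (deterministic)
      budget     : total cost the policy may incur along any trajectory
      grid       : demand discretization width; a coarser grid shrinks the table
                   at the price of approximation error (the rounding idea)
    """
    S = len(P[0])
    A = len(P[0][0])
    demands = np.arange(0.0, budget + grid, grid)   # discretized demand axis
    D = len(demands)
    NEG = -1e18                                     # stand-in for -infinity
    V = np.zeros((H + 1, S, D))                     # V[H] = 0 terminal values
    pi = np.full((H, S, D), -1, dtype=int)          # greedy deterministic policy

    for h in range(H - 1, -1, -1):
        for s in range(S):
            for d_idx, d in enumerate(demands):
                best_q, best_a = NEG, -1
                for a in range(A):
                    c = C[h][s][a]
                    if c > d + 1e-12:               # action exceeds remaining demand
                        continue
                    # Round leftover demand *down* onto the grid, so the table
                    # stays polynomial and feasibility is never overstated.
                    nd = int(np.floor((d - c) / grid + 1e-9))
                    q = R[h][s][a] + float(P[h][s][a] @ V[h + 1, :, nd])
                    if q > best_q:
                        best_q, best_a = q, a
                # NEG marks entries with no feasible action; a full treatment
                # would propagate infeasibility exactly rather than as a penalty.
                V[h, s, d_idx] = best_q
                pi[h, s, d_idx] = best_a
    return V, pi, demands

if __name__ == "__main__":
    # Tiny 2-state, 2-action instance with made-up numbers, purely for illustration.
    H, S, A = 3, 2, 2
    P = [[[np.array([0.5, 0.5]), np.array([1.0, 0.0])] for _ in range(S)] for _ in range(H)]
    R = [[[1.0, 0.2] for _ in range(S)] for _ in range(H)]   # action 0: high reward
    C = [[[1.0, 0.0] for _ in range(S)] for _ in range(H)]   # action 0: costly
    V, pi, demands = augmented_dp(P, R, C, H, budget=2.0, grid=0.5)
    print(V[0, 0, -1])   # best value from state 0 with the full budget remaining
```

The design point the sketch tries to convey: augmenting each state with a (rounded) remaining cost demand turns the constrained problem into an unconstrained DP whose table has polynomially many entries, and rounding demands downward ensures the returned deterministic policy never violates the budget, trading only a bounded amount of value, which is the flavor of guarantee an FPTAS provides.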