Optimal Exploration for Model-Based RL in Nonlinear Systems
Authors: Andrew Wagenmaker, Guanya Shi, Kevin G. Jamieson
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude with experiments demonstrating the effectiveness of our method in realistic nonlinear robotic systems. |
| Researcher Affiliation | Academia | Andrew Wagenmaker, Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195; Guanya Shi, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213; Kevin Jamieson, Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195 |
| Pseudocode | Yes | Algorithm 1 Optimal Exploration in Nonlinear Systems (informal) |
| Open Source Code | Yes | Code: https://github.com/ajwagen/nonlinear_sysid_for_control |
| Open Datasets | No | The paper uses simulated systems (drone, car, and a 1-D system example) which are internally generated, not publicly available datasets or benchmarks with specified access information. |
| Dataset Splits | No | The paper conducts experiments on simulated systems using 'episodes' of interaction. It does not provide explicit training, validation, or test dataset splits as one would for a standard machine learning dataset. |
| Hardware Specification | Yes | All experiments were run on a machine with 56 Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz CPUs, and 64GB RAM. |
| Software Dependencies | No | All code was implemented in PyTorch. However, no specific version numbers for PyTorch or Python are provided. |
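| Experiment Setup | Yes | For all examples the noise is distributed as $w_h \sim \mathcal{N}(0, 0.1 \cdot I)$. In all cases we set $\gamma^2 = 10H$ (where $\gamma^2$ is a bound on $\mathbb{E}_{\pi_{\mathrm{exp}}}[\sum_{h=1}^{H} u_h^\top u_h]$), and we therefore let $\Pi_{\mathrm{exp}}$ denote the set of all policies satisfying $\mathbb{E}_{\pi_{\mathrm{exp}}}[\sum_{h=1}^{H} u_h^\top u_h] \le \gamma^2$. |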
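
The two quantities in the Experiment Setup row can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the authors' repository: the function names (`sample_noise`, `input_energy`, `power_budget`, `scale_to_budget`) are hypothetical, and the rescaling step enforces the input budget on every episode, which is a stricter condition than the in-expectation constraint quoted in the table.

```python
import math
import random

NOISE_VAR = 0.1  # noise covariance is 0.1 * I, per the Experiment Setup row


def sample_noise(dim, rng):
    """Draw one noise vector with independent N(0, 0.1) coordinates."""
    std = math.sqrt(NOISE_VAR)
    return [rng.gauss(0.0, std) for _ in range(dim)]


def input_energy(inputs):
    """Return the sum over steps h of u_h^T u_h for one episode."""
    return sum(sum(x * x for x in u) for u in inputs)


def power_budget(horizon):
    """Return the budget gamma^2 = 10 * H for an episode of length H."""
    return 10.0 * horizon


def scale_to_budget(inputs):
    """Shrink an episode's inputs so that its energy is at most gamma^2.

    Assumption: a per-episode bound is used here for simplicity. It implies
    the expected-energy bound from the table but is not equivalent to it.
    """
    budget = power_budget(len(inputs))
    energy = input_energy(inputs)
    if energy <= budget:
        return [list(u) for u in inputs]
    factor = math.sqrt(budget / energy)
    return [[factor * x for x in u] for u in inputs]


if __name__ == "__main__":
    rng = random.Random(0)
    horizon = 50
    # Hypothetical 2-dimensional inputs that exceed the budget before scaling.
    raw_inputs = [[rng.uniform(-10, 10), rng.uniform(-10, 10)] for _ in range(horizon)]
    scaled = scale_to_budget(raw_inputs)
    print("budget:", power_budget(horizon))
    print("energy before:", input_energy(raw_inputs))
    print("energy after:", input_energy(scaled))
    print("one noise draw:", sample_noise(2, rng))
```

Scaling by the square root of the budget-to-energy ratio works because the energy is quadratic in the inputs, so the rescaled episode lands on the budget up to floating-point error.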