Robust Model Based Reinforcement Learning Using $\mathcal{L}_1$ Adaptive Control
Authors: Minjun Sung, Sambhu Harimanas Karumanchi, Aditya Gahlawat, Naira Hovakimyan
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the effectiveness of the L1-MBRL scheme, we conduct extensive numerical simulations using two baseline MBRL algorithms across multiple environments, including scenarios with action or observation noise. The results unequivocally demonstrate that the L1-MBRL scheme enhances the performance of the underlying MBRL algorithms without any redesign or retuning of the L1 controller from one scenario to another. (An illustrative noise-injection wrapper for such scenarios is sketched below the table.) |
| Researcher Affiliation | Academia | Minjun Sung, Sambhu H. Karumanchi, Aditya Gahlawat, Naira Hovakimyan; Department of Mechanical Science & Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; {mjsung2,shk9,gahlawat,nhovakim}@illinois.edu |
| Pseudocode | Yes | Algorithm 1: L1 ADAPTIVE CONTROL (a hedged implementation sketch of such a loop follows the table) |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for their methodology is publicly available. |
| Open Datasets | Yes | In our first experimental study, we evaluate the proposed L1-MBRL framework on five different OpenAI Gym environments (Brockman et al., 2016) with varying levels of state and action complexity. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits (e.g., percentages or sample counts) for their experiments. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions various frameworks and algorithms such as OpenAI Gym, METRPO, and MBMF, but does not provide specific version numbers for the software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | For the Inverted Pendulum environment, we set $\epsilon = 1$ and for HalfCheetah $\epsilon = 3$, while for other environments, we chose $\epsilon = 0.3$. Additionally, we selected a cutoff frequency of $\omega = 0.35/T_s$, where $T_s$ represents the sampling time interval of the environment. (These hyperparameters are collected in the config sketch below.) |
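
The "Pseudocode" row refers to Algorithm 1 ($\mathcal{L}_1$ adaptive control) in the paper. Below is a minimal NumPy sketch, not the authors' implementation, of one discrete step of an $\mathcal{L}_1$ augmentation loop built from the three standard pieces: a state predictor, a piecewise-constant adaptation law, and a first-order low-pass filter. The predictor gain `a_s`, the least-squares extraction of the matched uncertainty, and the callables `f_hat`/`g_hat` (the learned dynamics) are assumptions for illustration.

```python
import numpy as np

def l1_augmentation_step(x, x_hat, u_base, u_ad, f_hat, g_hat, Ts, a_s=10.0):
    """One discrete step of a generic L1 augmentation (illustrative sketch).

    x      : measured state,                shape (n,)
    x_hat  : state-predictor state,         shape (n,)
    u_base : baseline (MBRL policy) input,  shape (m,)
    u_ad   : current adaptive input,        shape (m,)
    f_hat, g_hat : learned dynamics, x_dot ~= f_hat(x) + g_hat(x) @ u
    """
    omega = 0.35 / Ts          # cutoff frequency quoted in the setup row
    x_tilde = x_hat - x        # prediction error

    # Piecewise-constant adaptation law, assuming a scalar Hurwitz predictor
    # matrix A_s = -a_s * I:  sigma_hat = -Phi(Ts)^{-1} e^{A_s Ts} x_tilde.
    phi = (1.0 - np.exp(-a_s * Ts)) / a_s
    sigma_hat = -(np.exp(-a_s * Ts) / phi) * x_tilde

    # Matched component of the uncertainty via least squares onto g_hat(x).
    G = g_hat(x)
    sigma_m, *_ = np.linalg.lstsq(G, sigma_hat, rcond=None)

    # First-order low-pass filter C(s) = omega / (s + omega), backward Euler.
    alpha = Ts * omega / (1.0 + Ts * omega)
    u_ad_next = (1.0 - alpha) * u_ad + alpha * (-sigma_m)

    # Propagate the state predictor with the current uncertainty estimate.
    x_dot_hat = f_hat(x) + G @ (u_base + u_ad) + sigma_hat - a_s * x_tilde
    x_hat_next = x_hat + Ts * x_dot_hat

    return u_base + u_ad_next, x_hat_next, u_ad_next
```

The filter bandwidth governs the usual robustness-versus-performance trade-off in $\mathcal{L}_1$ control; per the setup row, the paper ties it to the environment's sampling time as $\omega = 0.35/T_s$.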
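
The "Experiment Setup" row lists the only hyperparameters quoted from the paper. One hypothetical way to organize them, where the environment keys, the default value's placement, and the `cutoff_frequency` helper are illustrative assumptions rather than the authors' configuration:

```python
# Hypothetical per-environment settings reconstructed from the quoted setup;
# the dictionary keys are illustrative, not taken from the authors' code.
EPSILON = {
    "InvertedPendulum": 1.0,
    "HalfCheetah": 3.0,
}
EPSILON_DEFAULT = 0.3  # "for other environments, we chose epsilon = 0.3"

def cutoff_frequency(Ts):
    """Low-pass cutoff shared across environments: omega = 0.35 / Ts."""
    return 0.35 / Ts
```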
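
The "Research Type" row mentions evaluations under action or observation noise. A minimal sketch of how such noise could be injected with a `gym.Wrapper`; the zero-mean Gaussian noise model, the standard deviations, and the wrapper name are assumptions, and the 4-tuple `step` return follows the classic Gym API (Brockman et al., 2016):

```python
import numpy as np
import gym

class GaussianNoiseWrapper(gym.Wrapper):
    """Inject Gaussian noise into actions and/or observations (illustrative)."""

    def __init__(self, env, action_std=0.0, obs_std=0.0, seed=None):
        super().__init__(env)
        self.action_std = action_std
        self.obs_std = obs_std
        self.rng = np.random.default_rng(seed)

    def step(self, action):
        # Perturb the action before it reaches the environment.
        if self.action_std > 0:
            action = action + self.rng.normal(0.0, self.action_std, np.shape(action))
        obs, reward, done, info = self.env.step(action)
        # Perturb the observation returned to the agent.
        if self.obs_std > 0:
            obs = obs + self.rng.normal(0.0, self.obs_std, np.shape(obs))
        return obs, reward, done, info
```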