Robust Model Based Reinforcement Learning Using $\mathcal{L}_1$ Adaptive Control

Authors: Minjun Sung, Sambhu Harimanas Karumanchi, Aditya Gahlawat, Naira Hovakimyan

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "To evaluate the effectiveness of the L1-MBRL scheme, we conduct extensive numerical simulations using two baseline MBRL algorithms across multiple environments, including scenarios with action or observation noise. The results unequivocally demonstrate that the L1-MBRL scheme enhances the performance of the underlying MBRL algorithms without any redesign or retuning of the L1 controller from one scenario to another." (A sketch of such a noise perturbation appears after this table.) |
| Researcher Affiliation | Academia | "Minjun Sung, Sambhu H. Karumanchi, Aditya Gahlawat, Naira Hovakimyan. Department of Mechanical Science & Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA. {mjsung2,shk9,gahlawat,nhovakim}@illinois.edu" |
| Pseudocode | Yes | "Algorithm 1: L1 ADAPTIVE CONTROL" (A hedged code sketch of this loop appears after this table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for its methodology is publicly available. |
| Open Datasets | Yes | "In our first experimental study, we evaluate the proposed L1-MBRL framework on five different OpenAI Gym environments (Brockman et al., 2016) with varying levels of state and action complexity." |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits (e.g., percentages or sample counts) for its experiments. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions frameworks and algorithms such as OpenAI Gym, METRPO, and MBMF, but does not provide version numbers for the software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | "For the Inverted Pendulum environment, we set ϵ = 1 and for Halfcheetah ϵ = 3, while for other environments we chose ϵ = 0.3. Additionally, we selected a cutoff frequency of ω = 0.35/Ts, where Ts represents the sampling time interval of the environment." (See the usage example after this table.) |
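The paper's Algorithm 1 is the $\mathcal{L}_1$ adaptive control loop that augments the baseline MBRL policy action. Since no source code is released, the following is a minimal illustrative sketch of the standard $\mathcal{L}_1$ structure (state predictor, piecewise-constant adaptation law, low-pass-filtered compensation), assuming for brevity a fully actuated system in which the matched uncertainty acts directly on every state; the class, parameter names, and simplifications are ours, not the authors' implementation.

```python
import numpy as np

class L1Augmentation:
    """Illustrative sketch of an L1 adaptive control loop: state predictor,
    piecewise-constant adaptation law, and first-order low-pass filter.
    Simplified to a fully actuated system with matched uncertainty only;
    all names here are hypothetical, not the paper's code."""

    def __init__(self, f_hat, state_dim, ts, omega_cutoff, a_s=-1.0):
        self.f_hat = f_hat                       # learned dynamics model: x_dot ~ f_hat(x, u)
        self.ts = ts                             # sampling interval T_s
        self.a_s = a_s                           # Hurwitz predictor gain (a_s < 0, elementwise)
        self.alpha = np.exp(-omega_cutoff * ts)  # discrete pole of the low-pass filter
        self.x_hat = np.zeros(state_dim)         # predictor state
        self.sigma_hat = np.zeros(state_dim)     # piecewise-constant uncertainty estimate
        self.u_ad = np.zeros(state_dim)          # filtered adaptive input

    def step(self, x, u_rl):
        """Augment the baseline MBRL action u_rl at the current state x."""
        # Piecewise-constant adaptation from the prediction error x_tilde = x_hat - x.
        x_tilde = self.x_hat - x
        phi = (np.exp(self.a_s * self.ts) - 1.0) / self.a_s
        self.sigma_hat = -np.exp(self.a_s * self.ts) / phi * x_tilde

        # Low-pass filter the negated estimate to obtain the adaptive input.
        self.u_ad = self.alpha * self.u_ad - (1.0 - self.alpha) * self.sigma_hat

        # Total control: baseline action plus L1 compensation.
        u = u_rl + self.u_ad

        # Euler step of the state predictor with error feedback.
        x_hat_dot = self.f_hat(x, u) + self.sigma_hat + self.a_s * x_tilde
        self.x_hat = self.x_hat + self.ts * x_hat_dot
        return u
```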
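The Experiment Setup row reports the cutoff-frequency rule ω = 0.35/Ts. A short usage example of the sketch above, under an assumed sampling interval (the actual Ts is environment-specific and not restated in this excerpt):

```python
import numpy as np

ts = 0.05                        # assumed sampling interval; Ts is environment-specific
omega = 0.35 / ts                # cutoff-frequency rule reported in the paper
f_hat = lambda x, u: -x + u      # toy stand-in for the learned dynamics model
l1 = L1Augmentation(f_hat, state_dim=4, ts=ts, omega_cutoff=omega)
u = l1.step(x=np.random.randn(4), u_rl=np.zeros(4))  # augmented action for one step
```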
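The evaluation covers scenarios with action or observation noise on OpenAI Gym environments. Since the perturbation code is not released, one plausible way to reproduce such scenarios is a Gym wrapper along the following lines; the Gaussian noise model, its magnitudes, and the wrapper itself are our assumptions, not the paper's setup.

```python
import numpy as np
import gym

class NoisyEnv(gym.Wrapper):
    """Hypothetical wrapper injecting Gaussian action/observation noise;
    the paper's exact noise model and magnitudes are not specified here."""

    def __init__(self, env, obs_noise_std=0.0, act_noise_std=0.0):
        super().__init__(env)
        self.obs_noise_std = obs_noise_std
        self.act_noise_std = act_noise_std

    def step(self, action):
        if self.act_noise_std > 0:
            action = action + np.random.normal(0.0, self.act_noise_std, np.shape(action))
            action = np.clip(action, self.action_space.low, self.action_space.high)
        obs, reward, done, info = self.env.step(action)
        if self.obs_noise_std > 0:
            obs = obs + np.random.normal(0.0, self.obs_noise_std, np.shape(obs))
        return obs, reward, done, info

# Example: HalfCheetah with mild observation noise.
env = NoisyEnv(gym.make("HalfCheetah-v2"), obs_noise_std=0.1)
```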