Robust Inverse Constrained Reinforcement Learning under Model Misspecification

Authors: Sheng Xu, Guiliang Liu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we empirically evaluate the efficacy of the proposed AR-ICRL algorithm in both discrete and continuous environments under transition dynamics mismatch." and "Table 1 shows the evaluation results with large-scale noises."
Researcher Affiliation | Academia | "School of Data Science, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P.R. China. Correspondence to: Guiliang Liu <liuguiliang@cuhk.edu.cn>."
Pseudocode | Yes | "Algorithm 1 Adaptively Robust Inverse Constrained Reinforcement Learning (AR-ICRL)", "Algorithm 2 Safety-Robust Policy Iteration", and "Algorithm 3 Safety-Robust Proximal Policy Optimization" (a generic sketch of a constrained policy-iteration loop follows this table)
Open Source Code | Yes | "The code is available at https://github.com/Jasonxu1225/AR-ICRL."
Open Datasets | Yes | "Based on the ICRL benchmark (Liu et al., 2023), we conduct experiments on three continuous robot control tasks with predefined constraints, including Blocked Half-Cheetah, Blocked Ant, and Crippled Walker." (an illustrative constraint-wrapper sketch also follows this table)
Dataset Splits | No | The paper discusses training and testing in different environments but does not provide specific training/validation/test dataset splits, nor their percentages or counts, for the data within those environments.
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory used for running its experiments.
Software Dependencies | No | The paper refers to algorithms and methods such as PPO but does not provide specific version numbers for the software dependencies or libraries used in the implementation.
Experiment Setup | Yes | "Table 2. List of the utilized hyperparameters in this work. To ensure equitable comparisons, we maintain consistency in the parameters of the same neural networks across different models." The table lists the specific hyperparameter values.
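The paper's pseudocode (Algorithms 1-3) is not reproduced here. As rough orientation only, below is a minimal, generic sketch of a constrained policy-iteration loop on a toy tabular MDP. It is not the paper's Safety-Robust Policy Iteration: the transition, reward, and cost tables are made up, and the budget-based feasibility rule in the improvement step is an illustrative assumption.

```python
# Hypothetical sketch: generic constrained policy iteration on a toy tabular MDP.
# NOT the paper's Algorithm 2 (Safety-Robust Policy Iteration); it only illustrates
# alternating policy evaluation with an improvement step that screens out actions
# whose estimated constraint cost exceeds a budget. All quantities are made up.
import numpy as np

n_states, n_actions, gamma, budget = 4, 2, 0.9, 0.3
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # (S, A, S) transitions
R = rng.uniform(size=(n_states, n_actions))                       # rewards
C = rng.uniform(size=(n_states, n_actions)) * 0.05                # constraint costs

def evaluate(pi, signal):
    """Solve V = signal_pi + gamma * P_pi V for a deterministic policy pi."""
    P_pi = P[np.arange(n_states), pi]        # (S, S) transitions under pi
    s_pi = signal[np.arange(n_states), pi]   # (S,) per-state signal under pi
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, s_pi)

pi = np.zeros(n_states, dtype=int)
for _ in range(50):
    V_r = evaluate(pi, R)        # reward value of the current policy
    V_c = evaluate(pi, C)        # constraint-cost value of the current policy
    Q_r = R + gamma * P @ V_r    # (S, A) reward Q-values
    Q_c = C + gamma * P @ V_c    # (S, A) constraint-cost Q-values
    # Greedy improvement restricted to actions estimated to satisfy the budget;
    # fall back to the least-cost action if no action in a state is feasible.
    feasible = Q_c <= budget
    masked = np.where(feasible, Q_r, -np.inf)
    new_pi = np.where(feasible.any(axis=1), masked.argmax(axis=1), Q_c.argmin(axis=1))
    if np.array_equal(new_pi, pi):
        break
    pi = new_pi
print("policy:", pi)
```

In the paper's setting the transition dynamics themselves are mismatched, so the improvement step would additionally need to be robust to errors in P; the sketch above omits that entirely.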
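Likewise, the benchmark tasks (Blocked Half-Cheetah, Blocked Ant, Crippled Walker) are defined in the ICRL benchmark (Liu et al., 2023), not here. The sketch below only illustrates one plausible way a "blocked region" constraint could be exposed as a per-step cost via a Gymnasium wrapper; the wrapper name, the x-position threshold, and the cost signal are assumptions, not the benchmark's actual implementation.

```python
# Hypothetical sketch of what a "Blocked Half-Cheetah"-style constrained task could
# look like. The threshold and cost signal below are illustrative assumptions, not
# the ICRL benchmark's actual environment definition.
import gymnasium as gym

class BlockedRegionCost(gym.Wrapper):
    """Adds a constraint cost of 1.0 whenever the agent enters a blocked region."""

    def __init__(self, env, x_threshold=-3.0):
        super().__init__(env)
        self.x_threshold = x_threshold  # illustrative boundary of the blocked region

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # HalfCheetah-v4 reports the torso x-position in `info`.
        info["cost"] = float(info.get("x_position", 0.0) <= self.x_threshold)
        return obs, reward, terminated, truncated, info

if __name__ == "__main__":
    env = BlockedRegionCost(gym.make("HalfCheetah-v4"))
    obs, info = env.reset(seed=0)
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    print("constraint cost this step:", info["cost"])
```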