Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Robust Inverse Constrained Reinforcement Learning under Model Misspecification
Authors: Sheng Xu, Guiliang Liu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "In this section, we empirically evaluate the efficacy of the proposed AR-ICRL algorithm in both discrete and continuous environments under transition dynamics mismatch." and "Table 1 shows the evaluation results with large-scale noises." |
| Researcher Affiliation | Academia | School of Data Science, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P.R. China. Correspondence to: Guiliang Liu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Adaptively Robust Inverse Constrained Reinforcement Learning (AR-ICRL); Algorithm 2: Safety-Robust Policy Iteration; Algorithm 3: Safety-Robust Proximal Policy Optimization. |
| Open Source Code | Yes | The code is available at https://github.com/Jasonxu1225/AR-ICRL. |
| Open Datasets | Yes | Based on the ICRL benchmark (Liu et al., 2023), we conduct experiments on three continuous robot control tasks with predefined constraints, including Blocked Half-Cheetah, Blocked Ant, and Crippled Walker. |
| Dataset Splits | No | The paper discusses training and testing in different environments but does not provide specific training/validation/test dataset splits or their percentages/counts for data within those environments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory used for running its experiments. |
| Software Dependencies | No | The paper refers to algorithms and methods like PPO but does not provide specific version numbers for software dependencies or libraries used in implementation. |
| Experiment Setup | Yes | "Table 2. List of the utilized hyperparameters in this work. To ensure equitable comparisons, we maintain consistency in the parameters of the same neural networks across different models." The table lists specific hyperparameter values. |