reproducibilityindex.ai

Multi-Modal Inverse Constrained Reinforcement Learning from a Mixture of Demonstrations

Authors: Guanren Qiao, Guiliang Liu, Pascal Poupart, Zhiqiang Xu

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments in both discrete and continuous environments show that MMICRL outperforms other baselines in terms of constraint recovery and control performance.
Researcher Affiliation	Academia	Guanren Qiao¹, Guiliang Liu¹, Pascal Poupart²,³, Zhiqiang Xu⁴. ¹School of Data Science, The Chinese University of Hong Kong, Shenzhen; ²University of Waterloo; ³Vector Institute; ⁴Mohamed bin Zayed University of Artificial Intelligence
Pseudocode	Yes	Algorithm 1: Probing_Sets [...] Algorithm 2: Multi-Modal Inverse Constrained Reinforcement Learning (MMICRL)
Open Source Code	Yes	Our implementation is available at: https://github.com/qiaoguanren/Multi-Modal-Inverse-Constrained-Reinforcement-Learning.
Open Datasets	Yes	The demonstration data are assumed to have a zero violation rate. We create a 7x7 Gridworld map and design four distinct constraint map settings. [...] To facilitate constraint inference, a demonstration dataset containing expert trajectories is provided for each environment [7]. [...] For extracting features of cars and roads, we use the features collector from Commonroad RL [39]. The constraints that we are interested in are 1) Car distance 20m (agent 0) and 2) Car distance 40m (agent 1). Figure 6 shows the distribution of car distance in expert demonstrations for agent 0 and agent 1.
Dataset Splits	No	The paper mentions 'demonstration data' and 'expert trajectories' but does not specify any training, validation, or test dataset splits or percentages.
Hardware Specification	No	The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies	No	The paper mentions software such as 'MuJoCo' and 'Commonroad RL' but does not provide specific version numbers for these or for other ancillary software components, such as programming languages or libraries.
Experiment Setup	No	Appendix A.2 reports the detailed settings. [The main text defers the detailed settings to the Appendix, which was not provided, so specific experimental setup details such as hyperparameters do not appear in the main text.]