Distributed Inverse Constrained Reinforcement Learning for Multi-agent Systems
Authors: Shicheng Liu, Minghui Zhu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations are done to validate the proposed algorithm. (Abstract); This section presents two simulation examples. (Section 6) |
| Researcher Affiliation | Academia | School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA 16802, USA; {sfl5539,muz16}@psu.edu |
| Pseudocode | Yes | Algorithm 1 MEML D-ICRL; Algorithm 2 Inner process |
| Open Source Code | Yes | We also include the code in the supplementary materials. |
| Open Datasets | No | The first example uses a grid world introduced in [11], but the demonstration data is generated by the authors. For the second example, the authors state “We first control the simulated drones to their target doors, record nine pairs of trajectories, and distribute four and five pairs to two learners respectively,” indicating a custom dataset with no explicit public access. |
| Dataset Splits | No | The paper mentions distributing “10, 20, 30, 40 demonstrated trajectories” to learners and a “total 100 demonstrated trajectories” for baselines, but does not specify explicit train/validation/test dataset splits with percentages, sample counts, or citations to predefined splits. |
| Hardware Specification | No | The main paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments, although it mentions in the ethics review that “The compute type is included in the Appendix.” |
| Software Dependencies | No | The paper mentions using “Gazebo” for simulation but does not list the software dependencies (library names with version numbers) needed to replicate the experiments. |
| Experiment Setup | No | The paper states that “The detailed simulation setup is included in the Appendix” and “We include some details in Section 6 and the rest details are included in the Appendix,” but the main text itself does not provide specific experimental setup details such as hyperparameter values or training configurations. |