Multi-Agent Generative Adversarial Imitation Learning

Authors: Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon

NeurIPS 2018

Reproducibility variables, results, and supporting evidence:

Research Type: Experimental (5 experiments)
  Evidence: "We evaluate the performance of centralized, decentralized, and zero-sum versions of MAGAIL under two types of environments. One is a particle environment which allows for complex interactions and behaviors; the other is a control task, where multiple agents try to cooperate and move a plank forward. We collect results by averaging over 5 random seeds. Our implementation is based on OpenAI baselines [33]; please refer to Appendix C for implementation details."

Researcher Affiliation: Academia
  Evidence: Jiaming Song (Stanford University, tsong@cs.stanford.edu), Hongyu Ren (Stanford University, hyren@cs.stanford.edu), Dorsa Sadigh (Stanford University, dorsa@cs.stanford.edu), Stefano Ermon (Stanford University, ermon@cs.stanford.edu)

Pseudocode: Yes
  Evidence: "We outline the algorithm Multi-Agent GAIL (MAGAIL) in Appendix B."

Open Source Code: Yes
  Evidence: "Code for reproducing the experiments is in https://github.com/ermongroup/multiagent-gail."

Open Datasets: Yes
  Evidence: "We first consider the particle environment proposed in [14], which consists of several agents and landmarks."

Dataset Splits: No
  Evidence: The paper mentions using "100 to 400 episodes of expert demonstrations, each with 50 timesteps" but does not provide train/validation/test splits or cross-validation details for these demonstrations.

Hardware Specification: No
  Evidence: The paper does not specify the hardware used to run the experiments, such as GPU models, CPU models, or cloud computing instances.

Software Dependencies: No
  Evidence: "Our implementation is based on OpenAI baselines [33]; please refer to Appendix C for implementation details." (Reference [33] is "OpenAI baselines. https://github.com/openai/baselines, 2017.") This names a software library but gives no version numbers for it or for other key dependencies.

Experiment Setup: Yes
  Evidence: "We collect results by averaging over 5 random seeds."; "We consider 100 to 400 episodes of expert demonstrations, each with 50 timesteps, which is close to the amount of timesteps used for the control tasks in [16]."; "Following [34], we pretrain our Multi-Agent GAIL methods and the GAIL baseline using behavior cloning as initialization to reduce sample complexity for exploration."
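To make the seed-averaging protocol above concrete, here is a minimal, hypothetical sketch of reporting results over 5 random seeds. The function `run_experiment` is a stand-in invented for illustration (a real run would train MAGAIL end to end and return an evaluation score); only the protocol of seeding each run and averaging is taken from the paper.

```python
import random
import statistics

def run_experiment(seed: int) -> float:
    # Hypothetical stand-in for one full MAGAIL training + evaluation run.
    # Here it just returns a seeded pseudo-random score near 0.5 so the
    # averaging protocol can be demonstrated without any RL machinery.
    rng = random.Random(seed)
    return 0.5 + rng.uniform(-0.05, 0.05)

# Run with 5 random seeds, as in the paper's evaluation protocol.
seeds = [0, 1, 2, 3, 4]
scores = [run_experiment(s) for s in seeds]

mean = statistics.mean(scores)
std = statistics.stdev(scores)
print(f"mean={mean:.3f} +/- {std:.3f} over {len(seeds)} seeds")
```

Reporting mean and standard deviation across fixed seeds, rather than a single run, is what makes the "averaging over 5 random seeds" claim reproducible by a third party.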