Learning Multi-agent Behaviors from Distributed and Streaming Demonstrations

Authors: Shicheng Liu, Minghui Zhu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section shows that Algorithm 1 is effective to both discrete and continuous environments. We use four centralized baselines for comparisons: (i) Behavior inference from centralized and streaming demonstrations (BICS): This is the centralized counterpart of MA-BIRDS where a central learner obtains all the demonstrations at each online iteration.
Researcher Affiliation | Academia | Shicheng Liu & Minghui Zhu, School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA 16802, USA, {sfl5539,muz16}@psu.edu
Pseudocode | Yes | Algorithm 1 Multi-agent behavior inference from distributed and streaming demonstrations
Open Source Code | No | The paper does not provide an explicit statement or link for the open-sourcing of the methodology's code.
Open Datasets | No | The paper describes experiments in an 'evader-patroller setting' and 'Drone motion planning with obstacles' (using a Gazebo simulator), noting that the latter was introduced in [9]. However, it does not provide concrete access information (link, DOI, repository, or formal citation for data access) for the specific demonstration data used for training/evaluation in this paper.
Dataset Splits | No | The paper describes an online learning setting with 'streaming demonstrations' where data is revealed sequentially. It does not mention traditional fixed training, validation, or test dataset splits or specific percentages/counts for reproduction.
Hardware Specification | No | The paper mentions 'actual time varies a lot on different hardwares' but does not provide any specific details about the CPU, GPU, memory, or other hardware used for running the experiments or simulations.
Software Dependencies | No | The paper mentions using a 'Gazebo simulator' but does not specify a version number. It also references methods like 'soft Q-learning [41]' and 'soft actor-critic [42]' but does not list any specific software libraries or their versions used for implementation (e.g., Python, PyTorch, TensorFlow, or specific library versions).
Experiment Setup | No | The paper describes the general setup of the simulation environments and compares to baselines. While it discusses theoretical step sizes (α(n), β(n)), it does not provide concrete hyperparameter values (e.g., specific learning rates, batch sizes, network architectures, number of epochs) used in the experimental implementations.