Learning Safe Multi-agent Control with Decentralized Neural Barrier Certificates

Authors: Zengyi Qin, Kaiqing Zhang, Yuxiao Chen, Jingkai Chen, Chuchu Fan

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide extensive experiments to demonstrate that our method significantly outperforms other leading multi-agent control approaches in terms of maintaining safety and completing original tasks."
Researcher Affiliation | Academia | Massachusetts Institute of Technology; University of Illinois Urbana-Champaign; California Institute of Technology
Pseudocode | No | The paper describes methods in text and uses a computational graph (Figure 1), but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "Videos and source code can be found on the website": https://realm.mit.edu/blog/learning-safe-multi-agent-control-decentralized-neural-barrier-certificates
Open Datasets | No | The paper describes simulated environments (e.g., the multi-agent particle environment of Lowe et al. (2017) and the Nested-Rings environment adopted from Rodríguez-Seda et al. (2014)) from which data is collected online during training, rather than a pre-existing, publicly available dataset with concrete access information such as a specific link or repository.
Dataset Splits | No | The paper uses an on-policy training strategy in which data is collected online by running the current system; it does not specify fixed training, validation, and test splits with percentages or counts.
Hardware Specification | No | The paper states that "1024 is not the limit of our approach but rather due to the limited computational capability of our laptop used for the experiments," but does not provide specific hardware details such as CPU/GPU models or memory.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | "We choose γ = 10⁻² in implementation... ι is set to be 0.05 in our experiment... We minimize L by applying stochastic gradient descent with learning rate 10⁻³ and weight decay 10⁻⁶... The final loss function L = Lc + ηLg, where η is a balance weight that is set to 0.1 in our experiments."
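The quoted setup combines a certificate loss Lc and a goal-reaching loss Lg as L = Lc + ηLg, minimized by SGD. A minimal PyTorch sketch of that configuration is below; the network architectures and the placeholder loss terms are assumptions for illustration, not the authors' code — only the hyperparameter values (γ = 10⁻², ι = 0.05, η = 0.1, learning rate 10⁻³, weight decay 10⁻⁶) are taken from the paper.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted from the paper's experiment setup.
GAMMA = 1e-2         # CBF margin gamma
IOTA = 0.05          # relaxation iota (its exact use here is illustrative)
ETA = 0.1            # balance weight between the two loss terms
LR = 1e-3            # SGD learning rate
WEIGHT_DECAY = 1e-6  # SGD weight decay

# Hypothetical decentralized CBF and controller networks (architectures assumed).
cbf_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))
ctrl_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))

params = list(cbf_net.parameters()) + list(ctrl_net.parameters())
optimizer = torch.optim.SGD(params, lr=LR, weight_decay=WEIGHT_DECAY)

def total_loss(loss_cbf: torch.Tensor, loss_goal: torch.Tensor) -> torch.Tensor:
    """Combine certificate loss L_c and goal loss L_g: L = L_c + eta * L_g."""
    return loss_cbf + ETA * loss_goal

# One illustrative gradient step on placeholder losses computed from dummy
# observations; the real losses are built from the CBF conditions in the paper.
obs = torch.randn(8, 4)
loss_c = torch.relu(GAMMA - cbf_net(obs)).mean()  # placeholder certificate loss
loss_g = ctrl_net(obs).pow(2).mean()              # placeholder goal loss
loss = total_loss(loss_c, loss_g)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The η-weighted sum reflects the paper's statement that safety (Lc) dominates while the goal term (Lg) is down-weighted by η = 0.1.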