Surprise Minimizing Multi-Agent Learning with Energy-based Models

Authors: Karush Suri, Xiao Qi Shi, Konstantinos N Plataniotis, Yuri Lawryshyn

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further validate our theoretical claims in an empirical study of multi-agent tasks demanding collaboration in the presence of fast-paced dynamics. Our implementation and agent videos are available at the Project Webpage. |
| Researcher Affiliation | Collaboration | Karush Suri¹, Xiao Qi Shi², Konstantinos Plataniotis¹, Yuri Lawryshyn¹; ¹ University of Toronto, ² RBC Capital Markets |
| Pseudocode | Yes | Algorithm 1 Energy-based MIXer (EMIX) |
| Open Source Code | Yes | Our implementation and agent videos are available at the Project Webpage. |
| Open Datasets | Yes | We assess the validity of EMIX, when combined with QMIX, on multi-agent StarCraft II micromanagement scenarios [48] |
| Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] |
| Hardware Specification | Yes | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] |
| Software Dependencies | No | The paper mentions software such as QMIX, VDN, COMA, IQL, and SMiRL as methods being compared or used, but does not specify their version numbers to ensure reproducibility of the software environment. |
| Experiment Setup | Yes | Initialize learning rate α, temperature β and replay buffer R. ... The choice of β is further validated in Figure 5, wherein β = 0.01 provides consistent stable improvements over other values. |
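The Experiment Setup row quotes the initialization step of Algorithm 1 ("Initialize learning rate α, temperature β and replay buffer R"). As a minimal sketch of what that setup step might look like in code (this is not the authors' implementation; all names, the learning-rate value, and the buffer capacity are illustrative assumptions, while β = 0.01 is the value the paper reports as most stable in Figure 5):

```python
import random
from collections import deque

# Hedged sketch of the quoted setup step from Algorithm 1.
# Illustrative values, except BETA = 0.01, which the paper's
# Figure 5 reports as the most stable temperature setting.
ALPHA = 5e-4        # learning rate alpha (assumed value)
BETA = 0.01         # temperature beta (value reported in the paper)
BUFFER_SIZE = 5000  # replay buffer capacity (assumed value)


class ReplayBuffer:
    """Minimal FIFO replay buffer R for off-policy training."""

    def __init__(self, capacity):
        # deque with maxlen evicts the oldest transition when full
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size):
        # uniform sampling without replacement from stored transitions
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


buffer = ReplayBuffer(BUFFER_SIZE)
for step in range(100):
    buffer.add((step, step + 1))  # placeholder (state, next_state) pairs

batch = buffer.sample(32)
print(len(buffer), len(batch))  # 100 32
```

The buffer here is a plain uniform-sampling queue; how α and β enter the EMIX update itself is specified in the paper's Algorithm 1, not reproduced in this sketch.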