Surprise Minimizing Multi-Agent Learning with Energy-based Models
Authors: Karush Suri, Xiao Qi Shi, Konstantinos N Plataniotis, Yuri Lawryshyn
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further validate our theoretical claims in an empirical study of multi-agent tasks demanding collaboration in the presence of fast-paced dynamics. Our implementation and agent videos are available at the Project Webpage. |
| Researcher Affiliation | Collaboration | Karush Suri1, Xiao Qi Shi2, Konstantinos Plataniotis1, Yuri Lawryshyn1 1 University of Toronto, 2 RBC Capital Markets |
| Pseudocode | Yes | Algorithm 1 Energy-based MIXer (EMIX) |
| Open Source Code | Yes | Our implementation and agent videos are available at the Project Webpage. |
| Open Datasets | Yes | We assess the validity of EMIX, when combined with QMIX, on multi-agent StarCraft II micromanagement scenarios [48] |
| Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] |
| Hardware Specification | Yes | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] |
| Software Dependencies | No | The paper mentions software such as QMIX, VDN, COMA, IQL, and SMiRL as methods being compared or used, but does not specify their version numbers, which would be needed to reproduce the software environment. |
| Experiment Setup | Yes | Initialize learning rate α, temperature β and replay buffer R. ... The choice of β is further validated in Figure 5 wherein β = 0.01 provides consistent stable improvements over other values. (An illustrative sketch of this setup step follows the table.) |
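The quoted Experiment Setup row describes the initialization step of Algorithm 1 (EMIX): a learning rate α, a temperature β reported as most stable at β = 0.01, and a replay buffer R. The following is a minimal sketch of that step only, not the authors' implementation; class names such as `EMIXConfig` and `ReplayBuffer` are hypothetical, and the learning rate, buffer capacity, and batch size values are assumptions, since only β = 0.01 is stated in the quoted text.

```python
# Hypothetical sketch of the quoted setup step:
# "Initialize learning rate α, temperature β and replay buffer R."
from collections import deque
from dataclasses import dataclass
import random


@dataclass
class EMIXConfig:
    lr: float = 5e-4         # learning rate α (assumed value, not given in the quote)
    beta: float = 0.01       # temperature β; the quoted row reports β = 0.01 as most stable
    buffer_size: int = 5000  # capacity of replay buffer R (assumed value)
    batch_size: int = 32     # minibatch size for sampling from R (assumed value)


class ReplayBuffer:
    """FIFO replay buffer R holding multi-agent transitions."""

    def __init__(self, capacity: int):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        # transition: e.g. (joint_obs, joint_actions, reward, next_joint_obs, done)
        self.storage.append(transition)

    def sample(self, batch_size: int):
        # Sample a minibatch uniformly at random (capped at the current buffer size).
        return random.sample(list(self.storage), min(batch_size, len(self.storage)))


if __name__ == "__main__":
    cfg = EMIXConfig()
    buffer = ReplayBuffer(cfg.buffer_size)
    print(f"α={cfg.lr}, β={cfg.beta}, |R|≤{cfg.buffer_size}")
```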