Learning Multi-Agent Communication with Contrastive Learning
Authors: Yat Long Lo, Biswa Sengupta, Jakob Nicolaus Foerster, Michael Noukhovitch
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In communication-essential environments, our method outperforms previous work in both performance and learning speed. Using qualitative metrics and representation probing, we show that our method induces more symmetric communication and captures global state information from the environment. |
| Researcher Affiliation | Collaboration | Yat Long Lo, Dyson Robot Learning Lab, richie.lo@dyson.com; Biswa Sengupta, Imperial College London, biswasengupta@gmail.com; Jakob Foerster, FLAIR, University of Oxford, jakob.foerster@eng.ox.ac.uk; Michael Noukhovitch, Mila, Université de Montréal, mnoukhov@gmail.com |
| Pseudocode | No | The paper describes the architecture and mathematical formulations but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | No | The paper describes using three multi-agent environments (Traffic-Junction, Predator-Prey, Find-Goal) for data generation but does not provide concrete access information (link, DOI, repository) for pre-existing datasets. It references implementations of these environments from other papers (Koul, 2019; Lin et al., 2021; Singh et al., 2018), implying they are established environments, but it does not provide direct access to a specific dataset used for training. |
| Dataset Splits | No | The paper describes training for a given number of environment steps and then evaluating over episodes, but it does not specify a distinct validation set or explicit training/validation/test splits (percentages, counts, or a predefined partitioning method) for the data used. |
| Hardware Specification | No | The paper states that "All models are trained with the Adam optimizer" but provides no specific details regarding the hardware used for training, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper mentions using the "Adam optimizer (Kingma & Ba, 2014)" but does not specify version numbers for any programming languages, libraries, or other software components used in the implementation (e.g., Python, PyTorch/TensorFlow, CUDA versions). |
| Experiment Setup | Yes | The architecture is shown in Figure 8 and hyperparameters are further described, both in Appendix A.3. [...] Table 4 lists out the hyperparameters used for all the methods. |