Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

Authors: Joel Z Leibo, Edgar A Dueñez-Guzman, Alexander Vezhnevets, John P Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charlie Beattie, Igor Mordatch, Thore Graepel

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'We apply these test scenarios to standard MARL training algorithms, and demonstrate how Melting Pot reveals weaknesses not apparent from training performance alone.'
Researcher Affiliation | Industry | '1DeepMind 2Google Brain.'
Pseudocode | No | The paper describes its methods in prose and with diagrams (e.g., Fig. 3 for process steps), but it does not include any formally labeled pseudocode blocks or algorithms.
Open Source Code | No | The paper states 'Since Melting Pot will be openly released, it can be extended by any interested researchers.', which indicates future availability rather than current concrete access to the source code.
Open Datasets | No | The same statement indicates future availability of the environments and scenarios that serve as the dataset, not current concrete access.
Dataset Splits | No | The paper describes training and testing phases but does not mention a separate validation set or explain how data was split for validation purposes.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or processor types) used for running the experiments.
Software Dependencies | No | The paper mentions software components and architectures such as A3C, V-MPO, and OPRE, but it gives no version numbers for any libraries, frameworks, or programming languages (e.g., PyTorch 1.9, Python 3.8).
Experiment Setup | Yes | 'Each agent was trained for 10^9 steps. At test time, we set the focal population to be the uniform distribution over the N agents. All agent architectures had the same size convolutional net and LSTM.'
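The quoted test-time setup (the focal population is the uniform distribution over the N trained agents) can be sketched as follows. This is a minimal illustration, not the Melting Pot API: the names `sample_focal_population`, `agents`, and `num_slots` are assumptions introduced here for clarity.

```python
import random

def sample_focal_population(agents, num_slots, rng=random):
    # Fill each focal player slot by drawing a policy uniformly at random
    # from the N trained agents, mirroring the quoted setup: "the focal
    # population [is] the uniform distribution over the N agents".
    # NOTE: illustrative sketch only; not the Melting Pot implementation.
    return [rng.choice(agents) for _ in range(num_slots)]

# Hypothetical usage: 4 focal slots filled from 3 trained agents.
trained_agents = ["agent_0", "agent_1", "agent_2"]
focal = sample_focal_population(trained_agents, num_slots=4,
                                rng=random.Random(0))
assert len(focal) == 4
assert set(focal) <= set(trained_agents)
```

Sampling with replacement is what a uniform distribution over agents implies here: the same trained agent may occupy several focal slots in one test episode.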