Selectively Sharing Experiences Improves Multi-Agent Reinforcement Learning

Authors: Matthias Gerstgrasser, Tom Danino, Sarah Keren

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that our approach outperforms baseline no-sharing decentralized training and state-of-the-art multi-agent RL algorithms. Further, sharing only a small number of highly relevant experiences outperforms sharing all experiences between agents, and the performance uplift from selective experience sharing is robust across a range of hyperparameters and DQN variants. ... We evaluate the SUPER approach on a number of multi-agent benchmark domains. ... Figure 1: Performance of SUPER-dueling-DDQN variants with target bandwidth 0.1 on all domains.
Researcher Affiliation | Academia | Matthias Gerstgrasser, School of Engineering and Applied Sciences, Harvard University, and Computer Science Department, Stanford University (matthias@seas.harvard.edu); Tom Danino and Sarah Keren, The Taub Faculty of Computer Science, Technion - Israel Institute of Technology (tom.danino@campus.technion.ac.il, sarahk@cs.technion.ac.il)
Pseudocode | Yes | Algorithm 1 (SUPER algorithm for DQN): for each training iteration do: collect a batch of experiences b {DQN}; for each agent i, insert b_i into buffer_i {DQN}; for each agent i, select a subset b'_i ⊆ b_i of experiences to share (see the Experience Selection section) {SUPER} and, for each agent j ≠ i, insert b'_i into buffer_j {SUPER}; for each agent i, sample a train batch b̂_i from buffer_i and learn on it {DQN}; end for. A hedged Python sketch of this loop is given after the table.
Open Source Code | No | All source code is included in the appendix and will be made available on publication under an open-source license. We refer the reader to the included README file, which contains instructions to recreate the experiments discussed in this paper.
Open Datasets | Yes | We therefore run our experiments on several domains that are part of well-established benchmark packages. These include three domains from the Petting Zoo package [38], three domains from the Melting Pot package [18], and a two-player variant of the Atari 2600 game Space Invaders.
Dataset Splits | No | No explicit mention of validation dataset splits was found. The paper discusses train and test datasets, and hyperparameter tuning, but not a separate validation split.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) were explicitly mentioned for running the experiments.
Software Dependencies | Yes | We performed all experiments using the open-source library RLlib [20]. Experiments in Figures 1 and 6 were run using RLlib version 2.0.0; experiments in other figures were run using version 1.13.0.
Experiment Setup | Yes | Table 2: Hyperparameter Configuration Table, SISL: Pursuit Environment Parameters ... Table 3: Hyperparameter Configuration Table, MAgent: Battle Environment Parameters ... Table 4: Hyperparameter Configuration Table, MAgent: Adversarial Pursuit Environment Parameters ... Table 5: Hyperparameter Configuration Table, Melting Pot Policy Network ... Table 6: Hyperparameter Configuration Table, Atari Policy Network
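
As a reading aid for the Pseudocode row above, the following is a minimal Python sketch of the SUPER training iteration under stated assumptions. The ReplayBuffer class, the collect_fn / td_error_fn / learn_fn callbacks, the share_budget parameter, and the TD-error-based selection rule are illustrative placeholders, not the authors' code; the published experiments use RLlib's DQN implementations (versions noted in the Software Dependencies row), and the paper describes its own experience-selection criteria.

import random
from collections import deque

class ReplayBuffer:
    """Minimal FIFO replay buffer (illustrative stand-in for a DQN buffer)."""
    def __init__(self, capacity=10_000):
        self.storage = deque(maxlen=capacity)

    def insert(self, experiences):
        self.storage.extend(experiences)

    def sample(self, batch_size):
        # Uniform sampling; the paper's experiments also cover prioritized variants.
        return random.sample(list(self.storage), min(batch_size, len(self.storage)))

def select_to_share(experiences, td_errors, budget):
    """Pick the `budget` experiences with the highest TD error.
    This is one plausible notion of a 'highly relevant' experience;
    the paper defines its own selection schemes."""
    ranked = sorted(zip(td_errors, experiences), key=lambda pair: pair[0], reverse=True)
    return [exp for _, exp in ranked[:budget]]

def super_training_iteration(agents, buffers, collect_fn, td_error_fn, learn_fn,
                             share_budget=8, train_batch_size=32):
    """One training iteration following the structure of Algorithm 1."""
    # DQN: each agent collects a batch of experiences and stores it in its own buffer.
    batches = {i: collect_fn(i) for i in agents}
    for i in agents:
        buffers[i].insert(batches[i])

    # SUPER: each agent shares a small, highly relevant subset with every other agent.
    for i in agents:
        shared = select_to_share(batches[i], td_error_fn(i, batches[i]), share_budget)
        for j in agents:
            if j != i:
                buffers[j].insert(shared)

    # DQN: each agent samples from its (now augmented) buffer and learns.
    for i in agents:
        train_batch = buffers[i].sample(train_batch_size)
        learn_fn(i, train_batch)

Because sharing happens at buffer-insertion time, the receiving agents' sampling and learning code is left untouched, which is why the {SUPER} steps can be wrapped around an otherwise standard decentralized DQN loop.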