Equivariant Networks for Zero-Shot Coordination

Authors: Darius Muglich, Christian Schroeder de Witt, Elise van der Pol, Shimon Whiteson, Jakob Foerster

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical guarantees of our work and test on the AI benchmark task of Hanabi, where we demonstrate our methods outperforming other symmetry-aware baselines in zero-shot coordination, as well as able to improve the coordination ability of a variety of pre-trained policies.
Researcher Affiliation | Collaboration | Darius Muglich, University of Oxford (dariusm1997@yahoo.com); Christian Schroeder de Witt, FLAIR, University of Oxford (cs@robots.ox.ac.uk); Elise van der Pol, Microsoft Research AI4Science (evanderpol@microsoft.com); Shimon Whiteson, University of Oxford (shimon.whiteson@cs.ox.ac.uk); Jakob Foerster, FLAIR, University of Oxford (jakob.foerster@eng.ox.ac.uk)
Pseudocode | No | The paper describes its algorithmic approach in paragraph form but does not provide a formal pseudocode block or algorithm figure (an illustrative, non-authoritative sketch of the general symmetrization idea follows this table).
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Section 4, the Appendix, and the supplementary material
Open Datasets | Yes | The primary test bed for our methodology is the AI benchmark task Hanabi [5].
Dataset Splits | No | The paper describes training and evaluation on the Hanabi task (a reinforcement learning environment) and uses terms such as 'self-play scores' and 'evaluated over 5000 games'. However, it does not specify explicit train/validation/test splits with percentages or sample counts, which are more typical of supervised learning settings.
Hardware Specification | No | The acknowledgements state: 'The experiments were made possible by a generous equipment grant from NVIDIA.' The checklist confirms: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See the Appendix'. While NVIDIA is mentioned, specific GPU models or detailed hardware specifications are not provided in the main text.
Software Dependencies | No | The paper mentions several algorithms and frameworks, including R2D2 [25], VDN [43], SAD [21], the Adam optimizer [26], and LSTM [20]. However, it does not provide version numbers for these components or for any libraries or programming languages used.
Experiment Setup | No | For further details of the training setup and hyperparameters used in the following experiments, please refer to the Appendix.
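Since the paper presents its method in prose rather than pseudocode (see the Pseudocode row above), the sketch below illustrates only the general idea behind equivariant policies built by symmetrization: run a shared base policy on every group-permuted copy of the observation, undo the induced relabelling of the action logits, and average. This is a minimal, hedged example, not the authors' implementation; the class name SymmetrizedPolicy, the toy dimensions, and the permutation tensors are assumptions and do not reproduce the paper's Hanabi-specific observation or action encodings.

```python
# Minimal sketch of a group-symmetrized policy head (illustrative only).
# Assumed names: SymmetrizedPolicy, obs_perms, act_perms, and the toy
# 2-"color" permutations below are NOT taken from the paper's code.
import torch
import torch.nn as nn


class SymmetrizedPolicy(nn.Module):
    """Averages a base policy over a finite group of (obs, action) permutations.

    obs_perms[i] and act_perms[i] are index tensors describing how group
    element g_i permutes observation features and action logits, respectively.
    """

    def __init__(self, base_policy: nn.Module, obs_perms, act_perms):
        super().__init__()
        self.base_policy = base_policy
        self.obs_perms = obs_perms   # list of LongTensors, each of shape [obs_dim]
        self.act_perms = act_perms   # list of LongTensors, each of shape [num_actions]

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        logits = []
        for obs_p, act_p in zip(self.obs_perms, self.act_perms):
            transformed_obs = obs[..., obs_p]         # apply g to the observation
            out = self.base_policy(transformed_obs)   # shared base policy
            inv_act_p = torch.argsort(act_p)          # undo g's action relabelling
            logits.append(out[..., inv_act_p])
        # Group-averaging makes the resulting policy equivariant by construction.
        return torch.stack(logits, dim=0).mean(dim=0)


if __name__ == "__main__":
    # Toy example: two "colors", with observation features and actions in color blocks.
    obs_dim, num_actions = 6, 4
    identity_obs, identity_act = torch.arange(obs_dim), torch.arange(num_actions)
    swap_obs = torch.tensor([3, 4, 5, 0, 1, 2])   # swap the two color blocks
    swap_act = torch.tensor([2, 3, 0, 1])

    base = nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, num_actions))
    policy = SymmetrizedPolicy(base, [identity_obs, swap_obs], [identity_act, swap_act])

    obs = torch.randn(1, obs_dim)
    out = policy(obs)
    out_swapped = policy(obs[..., swap_obs])
    # Equivariance check: permuting the observation permutes the output logits.
    print(torch.allclose(out[..., swap_act], out_swapped, atol=1e-5))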