Equivariant Networks for Zero-Shot Coordination
Authors: Darius Muglich, Christian Schroeder de Witt, Elise van der Pol, Shimon Whiteson, Jakob Foerster
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide theoretical guarantees of our work and test on the AI benchmark task of Hanabi, where we demonstrate our methods outperforming other symmetry-aware baselines in zero-shot coordination, as well as able to improve the coordination ability of a variety of pre-trained policies. |
| Researcher Affiliation | Collaboration | Darius Muglich University of Oxford dariusm1997@yahoo.com Christian Schroeder de Witt FLAIR, University of Oxford cs@robots.ox.ac.uk Elise van der Pol Microsoft Research AI4Science evanderpol@microsoft.com Shimon Whiteson University of Oxford shimon.whiteson@cs.ox.ac.uk Jakob Foerster FLAIR, University of Oxford jakob.foerster@eng.ox.ac.uk |
| Pseudocode | No | The paper describes an algorithmic approach in paragraph form but does not provide a formal pseudocode block or algorithm figure. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Section 4, the Appendix, and the supplementary material |
| Open Datasets | Yes | The primary test bed for the methodology is the publicly available AI benchmark task Hanabi: 'The primary test bed for our methodology is the AI benchmark task Hanabi [5].' |
| Dataset Splits | No | The paper describes training and evaluation on the Hanabi task (a reinforcement learning environment) and uses terms like 'self-play scores' and 'evaluated over 5000 games'. However, it does not specify explicit train/validation/test dataset splits with percentages or sample counts, which is more typical for supervised learning contexts. |
| Hardware Specification | No | The acknowledgements state: 'The experiments were made possible by a generous equipment grant from NVIDIA.' and the ethics statement confirms: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See the Appendix'. While NVIDIA is mentioned, specific GPU models or detailed hardware specifications are not provided in the main text. |
| Software Dependencies | No | The paper mentions several algorithms and frameworks like 'R2D2' [25], 'VDN' [43], 'SAD' [21], 'Adam' optimizer [26], and 'LSTM' [20]. However, it does not provide specific version numbers for any of these software components, libraries, or programming languages. |
| Experiment Setup | No | The main text defers all training setup and hyperparameter details to supplementary material: 'For further details of the training setup and hyperparameters used in the following experiments, please refer to the Appendix.' No experiment setup is specified in the main text itself. |