Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
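The validation step mentioned above, checking automated LLM labels against a manually labeled gold set, amounts to computing per-variable agreement. The sketch below is purely illustrative of that idea; the function name and toy data are assumptions, not the actual pipeline from [1].

```python
# Hypothetical sketch: validating LLM-assigned labels against a
# manually labeled gold set. Data and names are illustrative only.

def accuracy(llm_labels, gold_labels):
    """Fraction of items where the LLM label matches the manual label."""
    assert len(llm_labels) == len(gold_labels)
    matches = sum(l == g for l, g in zip(llm_labels, gold_labels))
    return matches / len(gold_labels)

# Toy example: labels for a single reproducibility variable on four papers.
llm = ["Yes", "No", "Yes", "No"]
gold = ["Yes", "No", "No", "No"]
print(accuracy(llm, gold))  # 0.75
```

In practice one would report such agreement separately for each reproducibility variable, since classification difficulty varies between them.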

Equivariant Networks for Zero-Shot Coordination

Authors: Darius Muglich, Christian Schroeder de Witt, Elise van der Pol, Shimon Whiteson, Jakob Foerster

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide theoretical guarantees of our work and test on the AI benchmark task of Hanabi, where we demonstrate our methods outperforming other symmetry-aware baselines in zero-shot coordination, as well as able to improve the coordination ability of a variety of pre-trained policies."
Researcher Affiliation | Collaboration | Darius Muglich (University of Oxford, EMAIL); Christian Schroeder de Witt (FLAIR, University of Oxford, EMAIL); Elise van der Pol (Microsoft Research AI4Science, EMAIL); Shimon Whiteson (University of Oxford, EMAIL); Jakob Foerster (FLAIR, University of Oxford, EMAIL)
Pseudocode | No | The paper describes an algorithmic approach in paragraph form but does not provide a formal pseudocode block or algorithm figure.
Open Source Code | Yes | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Section 4, the Appendix, and the supplementary material"
Open Datasets | Yes | "The primary test bed for our methodology is the AI benchmark task Hanabi [5]."
Dataset Splits | No | The paper describes training and evaluation on the Hanabi task (a reinforcement learning environment) and uses terms like 'self-play scores' and 'evaluated over 5000 games'. However, it does not specify explicit train/validation/test dataset splits with percentages or sample counts, which are more typical of supervised learning contexts.
Hardware Specification | No | The acknowledgements state: 'The experiments were made possible by a generous equipment grant from NVIDIA.' The ethics statement confirms: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See the Appendix'. While NVIDIA is mentioned, specific GPU models or detailed hardware specifications are not provided in the main text.
Software Dependencies | No | The paper mentions several algorithms and frameworks, such as 'R2D2' [25], 'VDN' [43], 'SAD' [21], the 'Adam' optimizer [26], and 'LSTM' [20]. However, it does not provide specific version numbers for any of these software components, libraries, or programming languages.
Experiment Setup | No | "For further details of the training setup and hyperparameters used in the following experiments, please refer to the Appendix."