Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability
Authors: Revan MacQueen, James Wright
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our findings through experiments on Leduc poker. |
| Researcher Affiliation | Academia | Revan MacQueen, Department of Computing Science, University of Alberta / Amii; James R. Wright, Department of Computing Science, University of Alberta / Amii |
| Pseudocode | Yes | Algorithm 1 SGDecompose; Algorithm 2 Compute γ; Algorithm 3 SGDecompose with behavior strategies; Algorithm 4 get BRs |
| Open Source Code | Yes | The codebase for our experiments is available at https://github.com/RevanMacQueen/Self-Play-Polymatrix. |
| Open Datasets | Yes | Leduc poker was originally developed for two players but was extended to a 3-player variant by Abou Risk & Szafron (2010); we use the 3-player variant here. |
| Dataset Splits | No | The paper does not specify explicit train/validation/test dataset splits in the context of supervised learning, as it focuses on learning through self-play in games. |
| Hardware Specification | No | The paper mentions 'Computation for this work was provided by the Digital Research Alliance of Canada' but does not specify exact hardware details such as GPU/CPU models or memory amounts. |
| Software Dependencies | No | The paper mentions software components like 'CFR+' and 'OpenSpiel implementation' but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We used the same parameters for each run of SGDecompose: λ = 0.5, B = 30, T = 200. We used a learning rate schedule where the learning rate η begins at 2^-6, then halves every 5 epochs until reaching 2^-17 to encourage convergence. We randomly initialize CFR+ with regrets between 0 and 0.001 chosen uniformly at random, which are the default values in OpenSpiel. |
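
The learning rate schedule quoted in the Experiment Setup row can be illustrated with a minimal sketch. It is not taken from the authors' codebase: the function name `learning_rate`, its parameter names, the zero-based epoch index, and the choice to hold the rate at 2^-17 once it is reached are all assumptions made for illustration.

```python
def learning_rate(epoch: int,
                  eta_start: float = 2.0 ** -6,
                  eta_floor: float = 2.0 ** -17,
                  halve_every: int = 5) -> float:
    """Step schedule: start at eta_start, halve every `halve_every` epochs,
    and never drop below eta_floor.

    Assumptions (not stated in the source): epochs are counted from 0,
    and the rate stays at eta_floor after reaching it.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    halvings = epoch // halve_every           # completed 5-epoch blocks
    return max(eta_start * 0.5 ** halvings, eta_floor)


# Print the rate at the start of the first few 5-epoch blocks.
for e in (0, 5, 10, 55, 60):
    print(e, learning_rate(e))
```

Under these assumptions the rate needs 11 halvings to go from 2^-6 to 2^-17, so the floor is first reached at epoch 55 and the rate is constant afterwards.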