Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability

Authors: Revan MacQueen, James Wright

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We demonstrate our findings through experiments on Leduc poker. |
| Researcher Affiliation | Academia | Revan MacQueen, Department of Computing Science, University of Alberta / Amii; James R. Wright, Department of Computing Science, University of Alberta / Amii |
| Pseudocode | Yes | Algorithm 1: SGDecompose; Algorithm 2: Compute γ; Algorithm 3: SGDecompose with behavior strategies; Algorithm 4: get BRs |
| Open Source Code | Yes | The codebase for our experiments is available at https://github.com/RevanMacQueen/Self-Play-Polymatrix. |
| Open Datasets | Yes | Leduc poker was originally developed for two players but was extended to a 3-player variant by Abou Risk & Szafron (2010); we use the 3-player variant here. |
| Dataset Splits | No | The paper does not specify explicit train/validation/test dataset splits in the context of supervised learning, as it focuses on learning through self-play in games. |
| Hardware Specification | No | The paper mentions 'Computation for this work was provided by the Digital Research Alliance of Canada' but does not specify exact hardware details such as GPU/CPU models or memory amounts. |
| Software Dependencies | No | The paper mentions software components like 'CFR+' and 'OpenSpiel implementation' but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We used the same parameters for each run of SGDecompose: λ = 0.5, B = 30, T = 200. We used a learning rate schedule where the learning rate η begins at 2^-6, then halves every 5 epochs until reaching 2^-17 to encourage convergence. We randomly initialize CFR+ with regrets between 0 and 0.001 chosen uniformly at random, which are the default values in OpenSpiel. |
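
The learning rate schedule and regret initialization described under Experiment Setup can be illustrated with a minimal sketch. This is not the authors' code and does not use the OpenSpiel API; the function names `learning_rate` and `init_regrets`, the zero-based epoch indexing, and the seed argument are assumptions made purely for illustration.

```python
import random


def learning_rate(epoch, start_exp=-6, floor_exp=-17, halve_every=5):
    """Step decay: start at 2**start_exp, halve every `halve_every` epochs,
    and never drop below 2**floor_exp.

    Assumption: epochs are counted from 0, so epochs 0-4 use 2**-6.
    """
    exponent = max(start_exp - epoch // halve_every, floor_exp)
    return 2.0 ** exponent


def init_regrets(num_entries, low=0.0, high=0.001, seed=None):
    """Draw initial regrets uniformly at random between `low` and `high`.

    Hypothetical helper: it mirrors the range quoted in the table,
    not any actual OpenSpiel function.
    """
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(num_entries)]


if __name__ == "__main__":
    # Print the learning rate at a few epochs to show the decay and the floor.
    for epoch in (0, 5, 10, 55, 100):
        print(epoch, learning_rate(epoch))
    print(init_regrets(3, seed=0))
```

Under these assumptions the rate is halved 11 times, so the floor of 2^-17 is reached at epoch 55 and held from then on.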