Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability
Authors: Revan MacQueen, James Wright
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our findings through experiments on Leduc poker. |
| Researcher Affiliation | Academia | Revan MacQueen, Department of Computing Science, University of Alberta / Amii; James R. Wright, Department of Computing Science, University of Alberta / Amii |
| Pseudocode | Yes | Algorithm 1 SGDecompose; Algorithm 2 Compute γ; Algorithm 3 SGDecompose with behavior strategies; Algorithm 4 get BRs |
| Open Source Code | Yes | The codebase for our experiments is available at https://github.com/RevanMacQueen/Self-Play-Polymatrix. |
| Open Datasets | Yes | Leduc poker was originally developed for two players but was extended to a 3-player variant by Abou Risk & Szafron (2010); we use the 3-player variant here. |
| Dataset Splits | No | The paper does not specify explicit train/validation/test dataset splits in the context of supervised learning, as it focuses on learning through self-play in games. |
| Hardware Specification | No | The paper mentions 'Computation for this work was provided by the Digital Research Alliance of Canada' but does not specify exact hardware details such as GPU/CPU models or memory amounts. |
| Software Dependencies | No | The paper mentions software components like 'CFR+' and 'OpenSpiel implementation' but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We used the same parameters for each run of SGDecompose: λ = 0.5, B = 30, T = 200. We used a learning rate schedule where the learning rate η begins at 2^-6, then halves every 5 epochs until reaching 2^-17 to encourage convergence. We randomly initialize CFR+ with regrets between 0 and 0.001 chosen uniformly at random, which are the default values in OpenSpiel. |
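
The learning-rate schedule quoted in the Experiment Setup row can be sketched as a small function. This is an illustration under stated assumptions, not code from the authors' repository: the function name `learning_rate`, its parameter names, and the choice to count epochs from 0 are all hypothetical. Only the values (start at 2^-6, halve every 5 epochs, stop at 2^-17) come from the quoted text.

```python
def learning_rate(epoch: int,
                  start_exp: int = -6,
                  floor_exp: int = -17,
                  halve_every: int = 5) -> float:
    """Step-halving schedule (hypothetical helper, not the authors' code).

    eta starts at 2**start_exp, is halved once every `halve_every` epochs,
    and is never allowed to drop below 2**floor_exp.
    Assumes epochs are counted from 0.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    # Each completed block of `halve_every` epochs lowers the exponent by 1.
    exponent = max(start_exp - epoch // halve_every, floor_exp)
    return 2.0 ** exponent


if __name__ == "__main__":
    # Print the learning rate at a few epochs to show the steps and the floor.
    for e in (0, 4, 5, 10, 54, 55, 199):
        print(e, learning_rate(e))
```

Under these assumptions the exponent drops from -6 to -17 after 11 halvings, so the floor is reached at epoch 55 and the rate stays constant afterwards.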