XDO: A Double Oracle Algorithm for Extensive-Form Games
Authors: Stephen McAleer, JB Lanier, Kevin A Wang, Pierre Baldi, Roy Fox
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In tabular experiments on Leduc poker, we find that XDO achieves an approximate Nash equilibrium in a number of iterations an order of magnitude smaller than PSRO. Experiments on a modified Leduc poker game and Oshi-Zumo show that tabular XDO achieves a lower exploitability than CFR with the same amount of computation. We also find that NXDO outperforms PSRO and NFSP on a sequential multidimensional continuous-action game. |
| Researcher Affiliation | Academia | Stephen McAleer Department of Computer Science University of California, Irvine smcaleer@uci.edu; John Lanier Department of Computer Science University of California, Irvine jblanier@uci.edu; Kevin A. Wang Department of Computer Science University of California, Irvine kevinwang@kevinwang.us; Pierre Baldi Department of Computer Science University of California, Irvine pfbaldi@ics.uci.edu; Roy Fox Department of Computer Science University of California, Irvine royf@uci.edu |
| Pseudocode | Yes | Algorithm 1 XDO; Algorithm 2 NXDO |
| Open Source Code | Yes | Experiment code is available at https://github.com/indylab/nxdo. |
| Open Datasets | No | The paper refers to established game environments like Leduc poker, Oshi-Zumo, and custom games like m-Clone Leduc and the Loss Game. While it mentions using OpenSpiel [Lanctot et al., 2019] for some implementations, it does not provide concrete access information (specific link, DOI, or repository) for a publicly available 'dataset' of game states or play traces. |
| Dataset Splits | No | The paper describes experiments in various game environments but does not provide specific training/test/validation dataset splits (e.g., percentages or sample counts) for the data used in these experiments. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions several software components and algorithms used (e.g., CFR, XFP, PSRO, NFSP, PPO, DDQN, OpenSpiel) but does not provide specific version numbers for any of them (e.g., 'PyTorch 1.9' or 'OpenSpiel v0.1'). |
| Experiment Setup | No | The paper states, 'Training details and an analysis on the proportion of experience spent in the inner vs outer loop of NXDO are included in supplementary materials.' This indicates that the specific experimental setup details, such as hyperparameters or training configurations, are not present in the main text of the paper. |