Sparsified Linear Programming for Zero-Sum Equilibrium Finding
Authors: Brian Zhang, Tuomas Sandholm
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With experiments on poker endgames, we demonstrate in practice, for the first time, that modern linear program solvers are competitive against even game-specific modern variants of CFR in solving large extensive-form games, and can be used to compute exact solutions unlike iterative algorithms like CFR. |
| Researcher Affiliation | Collaboration | Brian Hu Zhang 1 Tuomas Sandholm 1 2 3 4 1Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA 2Strategic Machine, Inc. 3Strategy Robot, Inc. 4Optimized Markets, Inc. |
| Pseudocode | Yes | Algorithm 2 Matrix factorization |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the described methodology. |
| Open Datasets | Yes | We tested on the following benchmark games from the literature: Leduc poker (Southey et al., 2005) is a small variant of poker, played with one hole card and three community cards. Battleship (Farina et al., 2019) is the classic targeting game... Sheriff (Farina et al., 2019) is a simplified Sheriff of Nottingham game... No-limit hold-em (NLH) river endgames are endgames encountered by the poker-playing agent Libratus (Brown & Sandholm, 2017)... |
| Dataset Splits | No | The paper analyzes equilibrium finding in games and measures 'Nash gap' for accuracy, but it does not describe explicit training, validation, and test dataset splits as typically found in machine learning experiments. |
| Hardware Specification | No | The paper states that 'All algorithms were again restricted to a single core' but does not specify any particular CPU model, GPU, or other hardware components used for the experiments. |
| Software Dependencies | Yes | We compared state-of-the-art commercial implementations (Gurobi Optimization, 2019) of the common LP solving algorithms (simplex, dual simplex, and barrier) and our modified version of the O(log2(1/ε)) LPsparse algorithm (Yen et al., 2015)... In this experiment we used an optimized poker-specific C++ implementation of DCFR. |
| Experiment Setup | Yes | We ran LPsparse and CFR to target precision (Nash gap) 10 4, or for 2 hours, whichever threshold was hit first... The timeout was set to 3600 seconds (1 hour)... To align with Brown & Sandholm (2019), we used a simple action abstraction: the bets are half-pot, full-pot, and all-in, and the raises are full-pot and all-in. |