Depth-Limited Solving for Imperfect-Information Games
Authors: Noam Brown, Tuomas Sandholm, Brandon Amos
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted experiments on the games of heads-up no-limit Texas hold'em poker (HUNL) and heads-up no-limit flop hold'em poker (NLFH). ... Our main experiment uses depth-limited solving to produce a master-level HUNL poker AI called Modicum using computing resources found in a typical laptop. We test Modicum against Baby Tartanian8 [4], the winner of the 2016 Annual Computer Poker Competition, and against Slumbot [18], the winner of the 2018 Annual Computer Poker Competition. ... The performance of Modicum is shown in Table 1. |
| Researcher Affiliation | Academia | Noam Brown, Tuomas Sandholm, Brandon Amos; Computer Science Department, Carnegie Mellon University; noamb@cs.cmu.edu, sandholm@cs.cmu.edu, bamos@cs.cmu.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it explicitly state that code is released or available. |
| Open Datasets | Yes | We conducted experiments on the games of heads-up no-limit Texas hold'em poker (HUNL) and heads-up no-limit flop hold'em poker (NLFH). HUNL is the main large-scale benchmark for imperfect-information game AIs. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) as the experiments are conducted within game environments rather than on traditional datasets with explicit splits. |
| Hardware Specification | Yes | Modicum used just 700 core hours and 16GB of RAM to compute its strategy and can play in real time at the speed of human professionals (an average of 20 seconds for an entire hand of poker) using just a 4-core CPU. |
| Software Dependencies | No | The paper mentions software like PyTorch [32] and Adam [21] but does not provide specific version numbers for these or any other ancillary software components. |
| Experiment Setup | Yes | The DNN was trained using 180 million examples per player by optimizing the Huber loss with Adam [21], which we implemented using PyTorch [32]. In order for the network to run sufficiently fast on just a 4-core CPU, the DNN has just 4 hidden layers with 256 nodes in the first hidden layer and 128 nodes in the remaining hidden layers. This achieved a Huber loss of 0.02. |
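
The Experiment Setup row names a Huber loss and a network with hidden widths 256, 128, 128, 128. As a rough illustration only, the sketch below computes a mean Huber loss and the layer shapes those widths imply, using plain Python rather than PyTorch. The threshold `delta=1.0`, the helper names, and the input and output sizes passed to the helpers are assumptions made for this example; the quoted text does not state them, and this is not the authors' code.

```python
# Hidden-layer widths as quoted in the Experiment Setup row.
HIDDEN_WIDTHS = [256, 128, 128, 128]


def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss: quadratic for |error| <= delta, linear beyond it.

    delta=1.0 is an assumed threshold; the source does not give one.
    """
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err * err
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(pred)


def layer_shapes(n_inputs, n_outputs, hidden=HIDDEN_WIDTHS):
    """Return (fan_in, fan_out) pairs for a fully connected network.

    n_inputs and n_outputs are hypothetical; the source does not state them.
    """
    widths = [n_inputs, *hidden, n_outputs]
    return list(zip(widths[:-1], widths[1:]))


def parameter_count(n_inputs, n_outputs, hidden=HIDDEN_WIDTHS):
    """Count weights plus biases across all fully connected layers."""
    return sum(i * o + o for i, o in layer_shapes(n_inputs, n_outputs, hidden))
```

For instance, `layer_shapes(10, 1)` lists five weight matrices (four hidden layers plus an output layer), which suggests how small the network is and why it might plausibly run on a 4-core CPU.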