Depth-Limited Solving for Imperfect-Information Games

Authors: Noam Brown, Tuomas Sandholm, Brandon Amos

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted experiments on the games of heads-up no-limit Texas hold'em poker (HUNL) and heads-up no-limit flop hold'em poker (NLFH). ... Our main experiment uses depth-limited solving to produce a master-level HUNL poker AI called Modicum using computing resources found in a typical laptop. We test Modicum against Baby Tartanian8 [4], the winner of the 2016 Annual Computer Poker Competition, and against Slumbot [18], the winner of the 2018 Annual Computer Poker Competition. ... The performance of Modicum is shown in Table 1.
Researcher Affiliation | Academia | Noam Brown, Tuomas Sandholm, Brandon Amos; Computer Science Department, Carnegie Mellon University; noamb@cs.cmu.edu, sandholm@cs.cmu.edu, bamos@cs.cmu.edu
Pseudocode | No | The paper does not contain any structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it explicitly state that code is released or available.
Open Datasets | Yes | We conducted experiments on the games of heads-up no-limit Texas hold'em poker (HUNL) and heads-up no-limit flop hold'em poker (NLFH). HUNL is the main large-scale benchmark for imperfect-information game AIs.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) as the experiments are conducted within game environments rather than on traditional datasets with explicit splits.
Hardware Specification | Yes | Modicum used just 700 core hours and 16GB of RAM to compute its strategy and can play in real time at the speed of human professionals (an average of 20 seconds for an entire hand of poker) using just a 4-core CPU.
Software Dependencies | No | The paper mentions software like PyTorch [32] and Adam [21] but does not provide specific version numbers for these or any other ancillary software components.
Experiment Setup | Yes | The DNN was trained using 180 million examples per player by optimizing the Huber loss with Adam [21], which we implemented using PyTorch [32]. In order for the network to run sufficiently fast on just a 4-core CPU, the DNN has just 4 hidden layers with 256 nodes in the first hidden layer and 128 nodes in the remaining hidden layers. This achieved a Huber loss of 0.02.
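To make the quoted experiment setup concrete, below is a minimal PyTorch sketch consistent with the description: 4 hidden layers (256 units in the first, 128 in the remaining three), trained with the Huber loss and Adam. This is not the authors' released code; the input and output dimensions, activation function, learning rate, and training loop are assumptions not specified in the quoted passage.

```python
import torch
import torch.nn as nn

# Hypothetical sizes; the paper excerpt does not state the feature or output dimensions.
INPUT_DIM = 64
OUTPUT_DIM = 4

# 4 hidden layers: 256 units in the first, 128 in the remaining three, then a linear output.
# ReLU activations are an assumption.
model = nn.Sequential(
    nn.Linear(INPUT_DIM, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, OUTPUT_DIM),
)

loss_fn = nn.SmoothL1Loss()  # PyTorch's Huber-style loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # learning rate is an assumption

def train_step(features: torch.Tensor, targets: torch.Tensor) -> float:
    """One gradient step on a batch of (features, target values)."""
    optimizer.zero_grad()
    loss = loss_fn(model(features), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```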