Solving Games with Functional Regret Estimation
Authors: Kevin Waugh, Dustin Morrill, James Bagnell, Michael Bowling
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate empirically the method achieves higher quality strategies than state-of-the-art abstraction techniques given the same resources. In order to illustrate the practicality of RCFR, we test its performance in Leduc Hold'em, a simplified poker game. Our goal is to compare the strategies found by RCFR with varying regressors to strategies generated with conventional abstraction techniques. In addition, we examine the iterative behaviour of RCFR compared to CFR. |
| Researcher Affiliation | Academia | Kevin Waugh (waugh@cs.cmu.edu), Dustin Morrill (morrill@ualberta.ca), J. Andrew Bagnell (dbagnell@ri.cmu.edu), Michael Bowling (mbowling@ualberta.ca). School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA; Department of Computing Science, 2-21 Athabasca Hall, University of Alberta, Edmonton, AB T6G 2E8, Canada. |
| Pseudocode | Yes | Algorithm 1 Regret-matching with Regret Regression (a minimal illustrative sketch appears after this table) |
| Open Source Code | No | The paper does not provide an explicit statement about the release of its source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper describes the game Leduc Hold'em as a testbed and mentions running simulations ('was run for 100000 iterations') but does not refer to a publicly available dataset with a specific link, DOI, repository name, or formal citation. |
| Dataset Splits | No | The paper describes an online learning process in a simulated game environment with self-play, not a setup involving explicit training, validation, and test dataset splits for a fixed dataset. |
| Hardware Specification | No | The paper mentions 'Compute Canada for computational resources' but does not specify any exact GPU/CPU models, processor types, or memory details used for running experiments. |
| Software Dependencies | No | The paper describes algorithms and methods (e.g., 'regression tree', 'CFR') but does not list specific software libraries or tools with version numbers used for implementation or experimentation. |
| Experiment Setup | Yes | We use a regression tree aiming to minimize mean-squared error as our regressor. When training, we examine all candidate splits on a single feature and choose the one that results in the best immediate error reduction. The data is then partitioned according to this split and we recursively train both sets. If the error improvement at a node is less than a threshold, or no improvement can be made by any split, a leaf is inserted that predicts the average. It is this error threshold that we manipulate to control the complexity of the regressor, i.e., the size of the tree. All the training data is kept between iterations, as in Algorithm 1. Eight features were chosen... chance sampling CFR (Zinkevich et al. 2007) was run for 100000 iterations to solve each abstract game. ... RCFR and CFR were run for 100000 iterations to generate the set of RCFR strategies and a FULL strategy, respectively. (A minimal sketch of this regression tree also appears after this table.) |
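
The Pseudocode row above cites Algorithm 1, "Regret-matching with Regret Regression." As a rough illustration of the idea, the Python sketch below forms a regret-matching policy from a regressor's estimated cumulative regrets rather than from stored regret tables, and keeps all training data between iterations as the quoted setup describes. The `game`, `features`, and `regressor` interfaces are hypothetical placeholders, not the paper's implementation.

```python
import numpy as np


def regret_matching_policy(estimated_regrets):
    """Map estimated cumulative regrets at one information set to a policy.

    Regret-matching plays each action in proportion to its positive
    estimated regret; if no regret is positive, it plays uniformly.
    """
    positive = np.maximum(estimated_regrets, 0.0)
    total = positive.sum()
    if total > 0.0:
        return positive / total
    return np.full(len(estimated_regrets), 1.0 / len(estimated_regrets))


def rcfr_iteration(game, regressor, features, dataset):
    """One hypothetical RCFR-style iteration (illustrative only).

    1. Build the current policy from the regressor's regret estimates.
    2. Traverse the game to compute counterfactual regrets under that policy
       (the traversal itself is hidden behind the placeholder `game` object).
    3. Append (feature, regret) targets to the training data and refit,
       keeping all previous data, as in Algorithm 1.
    """
    policy = {
        infoset: regret_matching_policy(
            np.array([regressor.predict(features(infoset, a))
                      for a in game.actions(infoset)]))
        for infoset in game.information_sets()
    }
    regrets = game.counterfactual_regrets(policy)  # placeholder traversal
    for infoset, action_regrets in regrets.items():
        for a, r in zip(game.actions(infoset), action_regrets):
            dataset.append((features(infoset, a), r))
    regressor.fit(dataset)
    return policy
```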
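
The Experiment Setup row quotes the paper's description of its regressor: a greedy regression tree that chooses the single-feature split with the best immediate squared-error reduction and inserts a mean-predicting leaf when no split improves error by more than a threshold. The following is a minimal sketch of that description as we read it; the split enumeration and node representation are assumptions, not the authors' code.

```python
import numpy as np


def train_tree(X, y, error_threshold):
    """Greedy regression tree: split on the single feature/threshold with the
    largest immediate squared-error reduction; stop with a mean-predicting
    leaf when the best reduction falls below `error_threshold`."""
    node_error = np.sum((y - y.mean()) ** 2)
    best = None  # (error_reduction, feature_index, threshold, left_mask)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:  # candidate thresholds on feature j
            left = X[:, j] <= t
            split_error = (np.sum((y[left] - y[left].mean()) ** 2)
                           + np.sum((y[~left] - y[~left].mean()) ** 2))
            reduction = node_error - split_error
            if best is None or reduction > best[0]:
                best = (reduction, j, t, left)
    if best is None or best[0] < error_threshold:
        return {"leaf": True, "value": y.mean()}  # leaf predicts the average
    _, j, t, left = best
    return {"leaf": False, "feature": j, "threshold": t,
            "left": train_tree(X[left], y[left], error_threshold),
            "right": train_tree(X[~left], y[~left], error_threshold)}


def predict(tree, x):
    """Descend splits until a leaf and return its mean prediction."""
    while not tree["leaf"]:
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]
```

Raising `error_threshold` prunes splits earlier and yields a smaller tree, which mirrors how the quoted setup controls regressor complexity.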