Strategy-Based Warm Starting for Regret Minimization in Games
Authors: Noam Brown, Tuomas Sandholm
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that one can improve overall convergence in a game by first running CFR on a smaller, coarser abstraction of the game and then using the strategy in the abstract game to warm start CFR in the full game. We now present experimental results for our warm-starting algorithm. |
| Researcher Affiliation | Academia | Noam Brown, Computer Science Department, Carnegie Mellon University, noamb@cs.cmu.edu; Tuomas Sandholm, Computer Science Department, Carnegie Mellon University, sandholm@cs.cmu.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. The methods are described narratively and mathematically, but not in a code-like format. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. There is no mention of code release, repository links, or code in supplementary materials. |
| Open Datasets | No | The paper mentions running experiments on 'random 100x100 normal-form games' and 'Flop Texas Hold'em (FTH)'. However, it does not provide concrete access information (link, DOI, repository name, or formal citation with authors/year) for a publicly available or open dataset for either of these. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. It discusses running CFR for a certain number of iterations, but not data splits for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. It mentions 'three core minutes' for an iteration of CFR, which refers to computation time, not hardware specifications. |
| Software Dependencies | No | The paper does not provide specific ancillary software details needed to replicate the experiment. It mentions algorithms like CFR and MCCFR and refers to 'implementation tricks', but does not list software names with version numbers (e.g., programming languages, libraries, solvers). |
| Experiment Setup | Yes | When resetting, we determine the number of iterations to warm start to based on an estimated function of the convergence rate of CFR in FTH, which is determined by the first 10 iterations of CFR. Our projection method estimated that after T iterations of CFR, the average strategy σ̄^T is a 10.82 / T-equilibrium. Thus, when warm starting based on a strategy profile with exploitability x, we warm start to T = 10.82 / x (see the sketches after the table). Figure 2 shows performance when warm starting at 100, 500, and 2500 iterations. These are three separate runs, where we warm start once on each run. We compare them to a run of CFR with no warm starting. Based on the average strategies when warm starting occurred, the runs were warm started to 97, 490, and 2310 iterations, respectively. The MCCFR run uses an abstraction with 5,000 buckets on the flop. After six core minutes of the MCCFR run, its average strategy was used to warm start CFR in the full game to T = 70 using λ = 0.08. |
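
The experiment-setup row describes projecting an exploitability estimate onto an iteration count. The Python sketch below illustrates that arithmetic under the fitted curve exploitability ≈ 10.82 / T quoted in the row; the function names, the least-squares fit over the first 10 iterations, and the rounding rule are our own assumptions, not code from the paper.

```python
# Illustrative sketch only (assumptions noted above); not the authors' code.

def fit_convergence_constant(iterations, exploitabilities):
    """Fit C in the model exploitability(T) ~= C / T via least squares,
    e.g. from the first 10 CFR iterations as the paper describes."""
    num = sum(x / t for t, x in zip(iterations, exploitabilities))
    den = sum(1.0 / (t * t) for t in iterations)
    return num / den

def warm_start_iterations(exploitability, c=10.82):
    """Invert x = c / T to get the number of CFR iterations T that a strategy
    profile with exploitability x is 'worth' when warm starting."""
    return max(1, round(c / exploitability))

# Example: under the fitted curve, exploitability ~0.108 maps to ~100 iterations.
print(warm_start_iterations(0.108))  # -> 100
```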
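The paper's actual warm-start rule (how regrets are initialized from a strategy and how λ discounts the warm start) is not quoted in the table, so the following is only a simplified stand-in: it seeds cumulative regrets proportional to the given strategy, scaled by the target iteration count and a discount loosely in the spirit of λ, so that regret matching immediately reproduces that strategy. All function names and the exact scaling are assumptions for illustration.

```python
# Simplified stand-in for strategy-based warm starting of regret matching.
# This is NOT the paper's exact initialization rule; it only shows the mechanics.
import numpy as np

def regret_matching(cumulative_regret):
    """Standard regret matching: play proportionally to positive cumulative
    regret, falling back to uniform when no action has positive regret."""
    positive = np.maximum(cumulative_regret, 0.0)
    total = positive.sum()
    if total > 0.0:
        return positive / total
    return np.full(len(cumulative_regret), 1.0 / len(cumulative_regret))

def warm_start_infoset(sigma, t_warm, discount=0.08):
    """Seed cumulative regret and cumulative strategy at one information set so
    that regret matching immediately plays sigma, scaled as if t_warm iterations
    had already elapsed. `discount` plays a role loosely analogous to the
    paper's lambda parameter (lambda = 0.08 in the FTH experiment)."""
    sigma = np.asarray(sigma, dtype=float)
    cumulative_regret = discount * t_warm * sigma
    cumulative_strategy = t_warm * sigma
    return cumulative_regret, cumulative_strategy

regrets, avg_strategy_sum = warm_start_infoset([0.5, 0.3, 0.2], t_warm=70)
print(regret_matching(regrets))  # -> [0.5 0.3 0.2]
```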