Using Response Functions to Measure Strategy Strength
Authors: Trevor Davis, Neil Burch, Michael Bowling
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of this technique in Leduc Hold'em against opponents that use the UCT Monte Carlo tree search algorithm. |
| Researcher Affiliation | Academia | Trevor Davis, Neil Burch, and Michael Bowling; {trdavis1,burch,mbowling}@ualberta.ca; Department of Computing Science, University of Alberta, Edmonton, AB, Canada T6G 2E8 |
| Pseudocode | No | The paper describes algorithms like CFR-f and UCT, but it does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a link or explicit statement of release) to open-source code for the described methodology. |
| Open Datasets | No | The paper describes using the game 'Leduc Hold'em' as a domain for experiments and references a paper for its full details, but it does not provide concrete access information (link, DOI, repository, or explicit statement of public availability) for a dataset used for training. |
| Dataset Splits | No | The paper does not specify dataset splits (e.g., percentages, counts) for training, validation, or testing. It mentions averaging results over '100 independent runs of UCT'. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory). |
| Software Dependencies | No | The paper mentions algorithms like UCT and CFR, but it does not provide specific version numbers for any software dependencies or libraries used in the implementation. |
| Experiment Setup | Yes | For each iteration $t$ of CFR-UCT, we ran a CFR update for the CFR-agent to create strategy $\sigma_1^t$, then we used UCT to train a response to $\sigma_1^t$. On each iteration the UCT-agent created an entirely new game tree, so the response depended only on $\sigma_1^t$. We gave the UCT-agent $k$ iterations of the UCT algorithm, each of which corresponds to one sample of $\sigma_1^t$, where $k$ is a parameter of CFR-UCT. |
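
The loop described in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps Leduc Hold'em for rock-paper-scissors, uses plain regret matching as a stand-in for the CFR update, and uses a one-step UCB1 learner as a stand-in for UCT (UCT applies UCB1 at every node of a tree, so on a single-decision game it reduces to UCB1). The names `regret_matching`, `ucb_response`, and `cfr_uct`, the exploration constant `c`, and the choice to return the responder's visit frequencies as its strategy are all assumptions made for illustration.

```python
import math
import random

# Payoff to player 1 (rows) in rock-paper-scissors; player 2 receives the negation.
# Toy stand-in for Leduc Hold'em (assumption, not the paper's domain).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]
N = 3


def regret_matching(regrets):
    """Play in proportion to positive regret; uniform if no regret is positive."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    return [p / total for p in positive] if total > 0 else [1.0 / N] * N


def ucb_response(sigma1, k, rng, c=2.0):
    """Train a fresh UCB1 responder for k iterations against sigma1.

    Each iteration draws exactly one sample from sigma1. The responder starts
    from scratch on every call, so its output depends only on sigma1.
    Returns visit frequencies as the response strategy (assumed convention).
    """
    counts = [0] * N
    means = [0.0] * N
    for i in range(k):
        if i < N:
            a2 = i  # try every action once before using the UCB1 rule
        else:
            a2 = max(range(N), key=lambda a: means[a]
                     + c * math.sqrt(math.log(i) / counts[a]))
        a1 = rng.choices(range(N), weights=sigma1)[0]  # one sample of sigma1
        reward = -PAYOFF[a1][a2]
        counts[a2] += 1
        means[a2] += (reward - means[a2]) / counts[a2]  # running average
    return [n / k for n in counts]


def cfr_uct(iterations, k, seed=0):
    """Alternate a regret update for player 1 with a freshly trained response."""
    rng = random.Random(seed)
    regrets = [0.0] * N
    strategy_sum = [0.0] * N
    for _ in range(iterations):
        sigma1 = regret_matching(regrets)        # current strategy for player 1
        sigma2 = ucb_response(sigma1, k, rng)    # response trained only on sigma1
        values = [sum(PAYOFF[a1][a2] * sigma2[a2] for a2 in range(N))
                  for a1 in range(N)]
        expected = sum(sigma1[a] * values[a] for a in range(N))
        for a in range(N):
            regrets[a] += values[a] - expected   # regret against the response
            strategy_sum[a] += sigma1[a]
    return [s / iterations for s in strategy_sum]  # average strategy


if __name__ == "__main__":
    print(cfr_uct(iterations=200, k=50))
```

Here `k` plays the role of the CFR-UCT parameter from the table: larger values give the responder more samples of the current strategy before its response is used in the regret update.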