reproducibilityindex.ai

Solving Large Extensive-Form Games with Strategy Constraints

Authors: Trevor Davis, Kevin Waugh, Michael Bowling (pp. 1861-1868)

AAAI 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	5 Experimental evaluation. We present two domains for experimental evaluation in this paper. In the first, we use constraints to model a secondary objective when generating strategies in a model security game. In the second domain, we use constraints for opponent modeling in a small poker game. We demonstrate that using constraints for modeling data allows us to learn counter-strategies that approach optimal counter-strategies as the amount of data increases.
Researcher Affiliation	Collaboration	Trevor Davis (1), Kevin Waugh (2), Michael Bowling (2, 1); (1) Department of Computing Science, University of Alberta; (2) DeepMind
Pseudocode	Yes	Algorithm 1 Constrained CFR
Open Source Code	No	The transit game experiments were implemented with code made publicly available by the game theory group of the Artificial Intelligence Center at Czech Technical University in Prague. This refers to code the authors used, not necessarily the code for their proposed CCFR algorithm itself. The paper gives no explicit statement or link for the authors' own code.
Open Datasets	Yes	We ran our experiments in Leduc Hold'em (Southey et al. 2005), a small poker game played with a six card deck over two betting rounds.
Dataset Splits	No	The paper describes generating constraints from observed games and evaluating performance as the number of observed games increases, but it does not specify explicit training, validation, or test dataset splits, nor does it mention cross-validation.
Hardware Specification	No	The paper states 'Computing resources were provided by Compute Canada and Calcul Québec.' but does not specify any particular hardware components such as GPU models, CPU models, or memory details used for the experiments.
Software Dependencies	Yes	by comparing its produced strategies with strategies produced by solving the LP representation of the game with the simplex solver in IBM ILOG CPLEX 12.7.1.
Experiment Setup	Yes	We update the CCFR constraint weights λ using stochastic gradient ascent with constant step size α_t = 1, which we found to work well across a variety of game sizes and risk bounds.
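
The Experiment Setup excerpt describes updating constraint weights λ by stochastic gradient ascent with a constant step size of 1. The following is a minimal sketch of what one such step could look like, not the authors' implementation. It assumes each constraint is written so that a positive value means the constraint is violated, that the weights are kept nonnegative by projection, and that an optional upper bound may be applied. The function name `update_constraint_weights`, the `violations` input, and the `upper_bound` parameter are hypothetical and do not come from the paper.

```python
from typing import List, Optional


def update_constraint_weights(
    weights: List[float],
    violations: List[float],
    step_size: float = 1.0,
    upper_bound: Optional[float] = None,
) -> List[float]:
    """Take one projected gradient-ascent step on the constraint weights.

    Assumption (not from the paper): violations[i] > 0 means constraint i
    is currently violated, so its weight should grow; a negative value
    means the constraint has slack, so its weight should shrink.
    """
    updated = []
    for weight, violation in zip(weights, violations):
        # Gradient-ascent step with a constant step size.
        new_weight = weight + step_size * violation
        # Project back onto the nonnegative range.
        new_weight = max(0.0, new_weight)
        # Optionally cap the weight (hypothetical bound for illustration).
        if upper_bound is not None:
            new_weight = min(upper_bound, new_weight)
        updated.append(new_weight)
    return updated


# Usage: the first constraint is violated by 0.5, the second has slack of 3.0.
print(update_constraint_weights([0.0, 2.0], [0.5, -3.0]))  # [0.5, 0.0]
```

In a stochastic setting, the `violations` values would be noisy estimates computed from the current iteration's strategy rather than exact quantities. The projection step is what keeps the weights in a valid range after each update.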