Finding and Certifying (Near-)Optimal Strategies in Black-Box Extensive-Form Games
Authors: Brian Hu Zhang, Tuomas Sandholm
AAAI 2021, pp. 5779-5788
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate experimentally that, in the black-box setting, our methods are able to provide nontrivial exploitability guarantees while expanding only a small fraction of the game tree. We conducted experiments on two common benchmarks: (1) k-rank Goofspiel. (2) k-rank heads-up limit Leduc poker (Southey et al. 2005). |
| Researcher Affiliation | Collaboration | Computer Science Department, Carnegie Mellon University; Strategic Machine, Inc.; Strategy Robot, Inc.; Optimized Markets, Inc. |
| Pseudocode | Yes | Algorithm 6.1 Two-player zero-sum certificate finding; Algorithm 7.2 Certificate-finding with regret minimization |
| Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We conducted experiments on two common benchmarks: (1) k-rank Goofspiel. (2) k-rank heads-up limit Leduc poker (Southey et al. 2005). |
| Dataset Splits | No | The paper discusses 'playthroughs' and 'samples' from a black-box simulator but does not describe conventional training, validation, and test dataset splits with specific percentages or counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments, only mentioning the use of Gurobi for LP solving. |
| Software Dependencies | Yes | We use Gurobi v9.0.0 (Gurobi Optimization, LLC 2019) as the LP solver. |
| Experiment Setup | Yes | Since the LP solves are relatively expensive, we only recompute the LP solution every 100 playthroughs sampled. This does not change the asymptotic performance of the algorithm. To be consistent with the other algorithms, one iteration of MCCFR consists of one accepted loss vector per player. For the other algorithms, one iteration is one playthrough. (A hedged sketch of this batched LP recompute loop follows the table.) |
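To make the Experiment Setup row concrete, the following is a minimal sketch of the batched loop it describes: expand the partial game tree with sampled playthroughs from the black-box simulator, and recompute the certificate LP only every 100 playthroughs. All identifiers here (`sample_playthrough`, `solve_certificate_lp`, the placeholder bound formula) are hypothetical stand-ins, not the authors' code; the paper solves the actual certificate LP with Gurobi v9.0.0.

```python
# Hypothetical sketch of the batched LP-recompute loop from the
# Experiment Setup row. None of these functions are the authors' API.

import random


def solve_certificate_lp(tree):
    """Stand-in for the certificate LP solve (the paper uses Gurobi v9.0.0).

    A real implementation would build and solve the LP over the currently
    expanded portion of the game tree to certify an exploitability bound.
    The formula below is a placeholder so the sketch runs end to end.
    """
    return max(0.0, 1.0 - 0.001 * len(tree))


def sample_playthrough(rng):
    """Stand-in for one playthrough of the black-box simulator."""
    # A playthrough is represented here as a random action sequence.
    return tuple(rng.randrange(3) for _ in range(4))


def run(num_playthroughs=1000, lp_every=100, seed=0):
    rng = random.Random(seed)
    tree = set()           # expanded portion of the game tree
    bound = float("inf")   # current certified exploitability bound
    for t in range(1, num_playthroughs + 1):
        tree.add(sample_playthrough(rng))  # expand along the sampled path
        # LP solves are expensive, so recompute the certificate only every
        # `lp_every` playthroughs (100 in the paper); this leaves the
        # algorithm's asymptotic performance unchanged.
        if t % lp_every == 0:
            bound = solve_certificate_lp(tree)
    return bound


if __name__ == "__main__":
    print(f"certified exploitability bound: {run():.4f}")
```

The design point the row makes is simply amortization: the certificate bound only tightens as the tree grows, so recomputing it once per batch of playthroughs trades bound freshness for far fewer expensive LP solves.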