Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent

Authors: Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate convergence rates comparable to XFP and CFR in four benchmark games in the tabular case. Using function approximation, we find that our algorithm outperforms the tabular version in two of the games, which, to the best of our knowledge, is the first such result in imperfect information games among this class of algorithms.
Researcher Affiliation | Collaboration | Edward Lockhart¹, Marc Lanctot¹, Julien Pérolat¹, Jean-Baptiste Lespiau¹, Dustin Morrill¹,², Finbarr Timbers¹, Karl Tuyls¹. ¹DeepMind; ²University of Alberta, Edmonton, Canada
Pseudocode | Yes | Algorithm 1: Fictitious Play; Algorithm 2: Exploitability Descent (ED)
Open Source Code | No | The paper refers to a technical report [Lockhart et al., 2019] for proofs, but does not state that source code for the methodology is released or provide a specific link.
Open Datasets | Yes | Our experiments are run across four different imperfect information games. We provide very brief descriptions here; see Appendix ?? as well as [Kuhn, 1950; Southey et al., 2005] and [Lanctot, 2013, Chapter 3] for more detail. Kuhn poker is a simplified poker game first proposed by Harold Kuhn [Kuhn, 1950]. Leduc poker is a significantly larger game with two rounds and a 6-card deck in two suits, e.g. {JS, QS, KS, JH, QH, KH}. Liar's Dice(1,1) is a dice game where each player gets a single private die, rolled at the start of the game, and players proceed to bid on the outcomes of all dice in the game. Goofspiel is a card game where players try to obtain point cards by bidding simultaneously.
Dataset Splits | No | The paper describes experiments in a game-theoretic setting, focusing on convergence over iterations rather than explicit dataset splits. There is no mention of training/validation/test dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | We performed a sweep over the number of hidden layers (from 1 to 5), the number of hidden units (64, 128 or 256), the regularization weight (10⁻⁷, 10⁻⁶, 10⁻⁵, 10⁻⁴), and the initial learning rate (powers of 2).
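
The sweep quoted under Experiment Setup can be enumerated as a simple grid. The sketch below is illustrative only and is not the authors' code. The layer counts, unit counts, and regularization weights come from the quoted text; the range of learning-rate exponents is an assumption, since the source says only "powers of 2", and the names `ASSUMED_LR_EXPONENTS` and `sweep_configs` are hypothetical.

```python
from itertools import product

# Values stated in the quoted experiment setup.
HIDDEN_LAYERS = [1, 2, 3, 4, 5]
HIDDEN_UNITS = [64, 128, 256]
REG_WEIGHTS = [1e-7, 1e-6, 1e-5, 1e-4]

# ASSUMPTION: the source says only "powers of 2" for the initial learning
# rate. The exponent range below (2^-10 .. 2^-4) is a placeholder, not a
# value taken from the paper.
ASSUMED_LR_EXPONENTS = range(-10, -3)
LEARNING_RATES = [2.0 ** e for e in ASSUMED_LR_EXPONENTS]


def sweep_configs():
    """Yield one dict per point in the full Cartesian grid."""
    for layers, units, reg, lr in product(
        HIDDEN_LAYERS, HIDDEN_UNITS, REG_WEIGHTS, LEARNING_RATES
    ):
        yield {
            "hidden_layers": layers,
            "hidden_units": units,
            "regularization_weight": reg,
            "initial_learning_rate": lr,
        }


configs = list(sweep_configs())
```

The stated values alone give 5 × 3 × 4 = 60 combinations of architecture and regularization weight; the total grid size then depends on how many learning rates were actually tried, which the quoted text does not specify.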