Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent
Authors: Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate convergence rates comparable to XFP and CFR in four benchmark games in the tabular case. Using function approximation, we find that our algorithm outperforms the tabular version in two of the games, which, to the best of our knowledge, is the first such result in imperfect information games among this class of algorithms. |
| Researcher Affiliation | Collaboration | Edward Lockhart¹, Marc Lanctot¹, Julien Pérolat¹, Jean-Baptiste Lespiau¹, Dustin Morrill¹,², Finbarr Timbers¹, Karl Tuyls¹; ¹DeepMind, ²University of Alberta, Edmonton, Canada |
| Pseudocode | Yes | Algorithm 1: Fictitious Play; Algorithm 2: Exploitability Descent (ED) |
| Open Source Code | No | The paper refers to a technical report [Lockhart et al., 2019] for proofs, but does not state that source code for the methodology is released or provide a specific link. |
| Open Datasets | Yes | Our experiments are run across four different imperfect information games. We provide very brief descriptions here; see Appendix ?? as well as [Kuhn, 1950; Southey et al., 2005] and [Lanctot, 2013, Chapter 3] for more detail. Kuhn poker is a simplified poker game first proposed by Harold Kuhn [Kuhn, 1950]. Leduc poker is a significantly larger game with two rounds and a 6-card deck in two suits, e.g. {JS, QS, KS, JH, QH, KH}. Liar's Dice(1,1) is a dice game where each player gets a single private die, rolled at the start of the game, and players proceed to bid on the outcomes of all dice in the game. Goofspiel is a card game where players try to obtain point cards by bidding simultaneously. |
| Dataset Splits | No | The paper describes experiments in a game-theoretic setting, focusing on convergence over iterations rather than explicit dataset splits. There is no mention of training/validation/test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We performed a sweep over the number of hidden layers (from 1 to 5), the number of hidden units (64, 128 or 256), the regularization weight (10^-7, 10^-6, 10^-5, 10^-4), and the initial learning rate (powers of 2). |
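
The hyperparameter sweep in the Experiment Setup row can be enumerated as a plain grid. The following is a minimal sketch, not the authors' code: it reads the regularization weights as 10^-7 through 10^-4, and because the source says only "powers of 2" for the initial learning rate, the exponent range used here is an assumption chosen purely for illustration. The names `sweep_configs` and `LEARNING_RATES` are likewise hypothetical.

```python
from itertools import product

# Values stated in the Experiment Setup row.
HIDDEN_LAYERS = [1, 2, 3, 4, 5]
HIDDEN_UNITS = [64, 128, 256]
# Regularization weights, read as 10^-7 ... 10^-4.
REG_WEIGHTS = [1e-7, 1e-6, 1e-5, 1e-4]
# ASSUMPTION: the source only says "powers of 2"; this exponent
# range (2^-4 ... 2^0) is a placeholder, not taken from the paper.
LEARNING_RATES = [2.0 ** k for k in range(-4, 1)]


def sweep_configs():
    """Yield one dict per point in the full Cartesian grid."""
    for layers, units, reg, lr in product(
        HIDDEN_LAYERS, HIDDEN_UNITS, REG_WEIGHTS, LEARNING_RATES
    ):
        yield {
            "hidden_layers": layers,
            "hidden_units": units,
            "regularization_weight": reg,
            "initial_learning_rate": lr,
        }


configs = list(sweep_configs())
print(len(configs))  # 5 * 3 * 4 * (number of assumed learning rates)
```

Each yielded dictionary would be passed to whatever training routine is used; the source does not say whether the sweep was exhaustive or sampled, so the full grid here is only one possible reading.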