Fast computation of Nash Equilibria in Imperfect Information Games

Authors: Remi Munos, Julien Perolat, Jean-Baptiste Lespiau, Mark Rowland, Bart De Vylder, Marc Lanctot, Finbarr Timbers, Daniel Hennes, Shayegan Omidshafiei, Audrunas Gruslys, Mohammad Gheshlaghi Azar, Edward Lockhart, Karl Tuyls

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here we evaluate MAIO-BR on 2 matrix games with both ℓ2 and entropy regularization. In the Appendix, Section P we report experiments of MAIO for IIG and compare to other approaches (CFR, CFR-BR, CFR+). Figure 1. We report ℓ2 distance to the Nash eq. (in log-scale) for MAIO-BR on the ε-matrix game (Fig. a,b,c) and the biased rock-paper-scissors game (Fig. d).
Researcher Affiliation | Industry | ¹DeepMind. Correspondence to: Remi Munos <munos@google.com>.
Pseudocode | No | The algorithm is described narratively and mathematically (e.g., 'Algorithm [MAIO for IIG]: For each player i ∈ {1, 2}, we start with a uniform policy πi,0(x) from all x ∈ Xi. At every iteration t ≥ 0, we compute an improved policy πt(x) over πt...') but no formal pseudocode or algorithm block is presented.
Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the described methodology or a link to a code repository.
Open Datasets | No | The paper describes experiments on '2 matrix games' which are defined by their payoff matrices directly within the text (e.g., 'R = 0 1 ε 2 1' and 'R = 0 1 0.1 1 0 0.1 0.1 0.1 0'), but does not provide concrete access information such as specific links, DOIs, repository names, or formal citations for these as publicly available external datasets.
Dataset Splits | No | The paper describes numerical experiments on matrix games but does not provide specific dataset split information (percentages, sample counts, or methodology), as it deals with game theory algorithms rather than traditional machine learning datasets with explicit train/validation/test splits.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using 'the numpy package with double precision' but does not specify its version number or any other software dependencies with version details, which are required for full reproducibility.
Experiment Setup | Yes | We observe the exponential convergence with a rate that depends on ε (Fig. 1(a)) and the constant c (Fig. 1(b)) used in the learning rate (i.e., we chose ηt = c I( πt, πt)). The first game is defined by the matrix payoff: R = 0 1 ε 2 1 parameterized by some ε > 0.
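
The evaluation metric quoted above (ℓ2 distance to the Nash equilibrium of a matrix game, computed with numpy in double precision) can be illustrated with a minimal sketch. It assumes a zero-sum game in which the row player receives x^T R y, and it uses the standard, unbiased rock-paper-scissors matrix as a stand-in, since the paper's own payoff matrices are not fully recoverable from the quoted text. The function names `l2_distance` and `exploitability` and the example policies are hypothetical, and this is not the MAIO-BR algorithm itself.

```python
import numpy as np

# Assumption: standard (unbiased) rock-paper-scissors payoff for the row
# player. This is a stand-in, NOT one of the paper's payoff matrices.
R = np.array(
    [[0.0, -1.0, 1.0],
     [1.0, 0.0, -1.0],
     [-1.0, 1.0, 0.0]],
    dtype=np.float64,
)


def l2_distance(policy, reference):
    """Euclidean distance between two mixed strategies (float64)."""
    policy = np.asarray(policy, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    return float(np.linalg.norm(policy - reference))


def exploitability(payoff, x, y):
    """Total best-response gain of both players in the zero-sum game x^T R y."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    best_row = np.max(payoff @ y)  # row player maximizes x^T R y
    best_col = np.min(x @ payoff)  # column player minimizes x^T R y
    return float(best_row - best_col)


# The unique Nash equilibrium of standard rock-paper-scissors is uniform.
nash = np.full(3, 1.0 / 3.0)

# Hypothetical non-equilibrium policies, for illustration only.
x = np.array([0.5, 0.3, 0.2])
y = np.array([0.2, 0.5, 0.3])

# Joint l2 distance of the policy pair to the equilibrium pair.
joint_distance = float(np.hypot(l2_distance(x, nash), l2_distance(y, nash)))
log_distance = float(np.log10(joint_distance))  # log-scale, as in a log plot
```

The exploitability is zero exactly at a Nash equilibrium, so it gives a check that does not require knowing the equilibrium in advance, whereas the ℓ2 distance needs the equilibrium as a reference point.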