Fast computation of Nash Equilibria in Imperfect Information Games
Authors: Remi Munos, Julien Perolat, Jean-Baptiste Lespiau, Mark Rowland, Bart De Vylder, Marc Lanctot, Finbarr Timbers, Daniel Hennes, Shayegan Omidshafiei, Audrunas Gruslys, Mohammad Gheshlaghi Azar, Edward Lockhart, Karl Tuyls
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we evaluate MAIO-BR on 2 matrix games with both ℓ2 and entropy regularization. In the Appendix, Section P we report experiments of MAIO for IIG and compare to other approaches (CFR, CFR-BR, CFR+). Figure 1. We report ℓ2 distance to the Nash eq. (in log-scale) for MAIO-BR on the ε-matrix game (Fig. a,b,c) and the biased rock-paper-scissors game (Fig. d). |
| Researcher Affiliation | Industry | 1DeepMind. Correspondence to: Remi Munos <munos@google.com>. |
| Pseudocode | No | The algorithm is described narratively and mathematically (e.g., 'Algorithm [MAIO for IIG]: For each player i ∈ {1, 2}, we start with a uniform policy πi,0(x) for all x ∈ Xi. At every iteration t ≥ 0, we compute an improved policy π̄t(x) over πt...') but no formal pseudocode or algorithm block is presented. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the described methodology or a link to a code repository. |
| Open Datasets | No | The paper describes experiments on '2 matrix games' which are defined by their payoff matrices directly within the text (e.g., 'R = 0 1 ε 2 1' and 'R = 0 1 0.1 1 0 0.1 0.1 0.1 0'), but does not provide concrete access information such as specific links, DOIs, repository names, or formal citations for these as publicly available external datasets. |
| Dataset Splits | No | The paper describes numerical experiments on matrix games but does not provide specific dataset split information (percentages, sample counts, or methodology) as it deals with game theory algorithms rather than traditional machine learning datasets with explicit train/validation/test splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'the numpy package with double precision' but does not specify its version number or any other software dependencies with version details, which are required for full reproducibility. |
| Experiment Setup | Yes | We observe the exponential convergence with a rate that depends on ε (Fig. 1(a)) and the constant c (Fig. 1(b)) used in the learning rate (i.e., we chose ηt = c I(π̄t, πt)). The first game is defined by the matrix payoff: R = 0 1 ε 2 1 parameterized by some ε > 0. |
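The experiments quoted above measure the ℓ2 distance of learned strategies to the Nash equilibrium of a small zero-sum matrix game. The sketch below illustrates that kind of measurement with an entropy-regularized (multiplicative-weights) update on standard rock-paper-scissors, whose unique Nash equilibrium is the uniform strategy. This is a generic illustration, not the paper's MAIO-BR algorithm: the payoff matrix, step size `eta`, and iteration count are stand-ins, since the paper's exact biased matrices are garbled in the extraction above.

```python
import numpy as np

# Standard rock-paper-scissors payoff for the row player (zero-sum).
# Illustrative stand-in for the paper's biased matrix game.
R = np.array([[0., -1., 1.],
              [1., 0., -1.],
              [-1., 1., 0.]])

def mwu_average_iterates(R, T=4000, eta=0.05):
    """Run simultaneous multiplicative-weights updates for both players
    and return the time-averaged strategies, which converge to a Nash
    equilibrium in zero-sum matrix games."""
    n, m = R.shape
    x = np.ones(n) / n          # row player's mixed strategy
    y = np.ones(m) / m          # column player's mixed strategy
    x_sum = np.zeros(n)
    y_sum = np.zeros(m)
    for _ in range(T):
        # Row player ascends on x @ R @ y; column player descends.
        x = x * np.exp(eta * (R @ y))
        x /= x.sum()
        y = y * np.exp(-eta * (R.T @ x))
        y /= y.sum()
        x_sum += x
        y_sum += y
    return x_sum / T, y_sum / T

x_avg, y_avg = mwu_average_iterates(R)
nash = np.ones(3) / 3           # uniform play is the unique Nash of RPS
print(np.linalg.norm(x_avg - nash))   # l2 distance to equilibrium (small)
```

Plotting this distance over iterations on a log scale, as in the paper's Figure 1, would show the convergence behavior the "Experiment Setup" row describes.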