Adapting to game trees in zero-sum imperfect information games

Authors: Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko

ICML 2023

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "Finally, we provide experiments that compare the performances of these two algorithms with IXOMD and Balanced OMD. We observe that, despite having different theoretical guarantees, the algorithms all seem to have comparable performances in practice. All rates are summarized in Table 1. ... In this section we provide preliminary experiments for Balanced FTRL and Adaptive FTRL on simple games. The code used to generate these experiments is available at https://github.com/anon17893/IIG-tree-adaptation."

Researcher Affiliation | Collaboration | "CREST, ENSAE, IP Paris, Palaiseau, France; ENS Lyon, Lyon, France; Omron Sinic X, Tokyo, Japan; Deepmind, Paris, France; CRITEO AI Lab, Paris, France."

Pseudocode | Yes | "Algorithm 1 Balanced FTRL-Tsallis/Shannon ... Algorithm 2 Adaptive FTRL"

Open Source Code | Yes | "The code used to generate these experiments is available at https://github.com/anon17893/IIG-tree-adaptation."

Open Datasets | Yes | "Games We compare the algorithms on the following three standard benchmark tabular games: Kuhn poker (Kuhn, 1950), Leduc poker (Southey et al., 2005) and liars dice. We use the implementation of these games available in the Open Spiel library (Lanctot et al., 2019)."

Dataset Splits | No | The paper uses a self-play learning setup for game-playing agents, which does not involve traditional train/validation/test dataset splits. Performance is evaluated based on exploitability over episodes, not on static data partitions.

Hardware Specification | No | The paper does not mention any specific hardware (e.g., GPU models, CPU types, memory, or cloud instances) used for running the experiments.

Software Dependencies | No | The paper mentions using the "Open Spiel library (Lanctot et al., 2019)" but does not specify a version number for this library or any other software dependencies.

Experiment Setup | Yes | "We thus tune the rates separately for each algorithm and each game, using a (logarithmic) grid search on the global learning rates, while the base IX parameter was taken as 1/20 of this global learning rate."
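The tuning procedure quoted above can be illustrated with a minimal sketch. This is not the authors' released code: `run_algorithm` is a hypothetical stand-in for training one FTRL variant on one game and returning its final exploitability, and the grid exponents are assumed values chosen only for illustration. The one detail taken from the paper is that the base IX parameter is fixed at 1/20 of the global learning rate.

```python
def tune(run_algorithm, exponents=range(-3, 2)):
    """Logarithmic grid search over global learning rates (10^e for e in
    `exponents`), selecting the rate with the lowest returned exploitability.
    The base IX parameter is tied to the learning rate as lr / 20."""
    best_lr, best_score = None, float("inf")
    for lr in (10.0 ** e for e in exponents):
        ix = lr / 20.0  # base IX parameter, as described in the paper
        score = run_algorithm(learning_rate=lr, ix_param=ix)
        if score < best_score:
            best_lr, best_score = lr, score
    return best_lr, best_score
```

For example, `tune(my_runner)` would try learning rates 1e-3 through 1e1 and report the best-performing one; in practice the grid would be rerun separately for each algorithm and each game, as the quote indicates.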