Meta-Learning in Games

Authors: Keegan Harris, Ioannis Anagnostides, Gabriele Farina, Mikhail Khodak, Steven Wu, Tuomas Sandholm

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we evaluate our meta-learning algorithms on endgames faced by the poker agent Libratus against top human professionals. The experiments show that games with varying stack sizes can be solved significantly faster using our meta-learning techniques than by solving them separately, often by an order of magnitude." (Section 4, EXPERIMENTS)
Researcher Affiliation | Collaboration | Keegan Harris, Carnegie Mellon University...; Ioannis Anagnostides, Carnegie Mellon University...; Gabriele Farina, FAIR, Meta AI...; Mikhail Khodak, Carnegie Mellon University...; Zhiwei Steven Wu, Carnegie Mellon University...; Tuomas Sandholm, Carnegie Mellon University, Strategy Robot, Inc., Optimized Markets, Inc., Strategic Machine, Inc.
Pseudocode | Yes | "...see Algorithm 1 (in Appendix B) for pseudocode of the meta-version of OGD we consider."
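The paper's Algorithm 1 (Appendix B) is not reproduced here. As a hedged illustration of the general idea — warm-starting online gradient descent (OGD) across a sequence of related zero-sum games rather than solving each from scratch — the sketch below uses matrix-game payoffs and a simple "average of past solutions" meta-initialization. The function names (`project_simplex`, `solve_game_ogd`, `meta_solve`) and the specific meta-initialization are our assumptions, not necessarily what Algorithm 1 does.

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto the probability simplex.
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def solve_game_ogd(A, x0, y0, eta, m):
    # Both players run projected OGD in self-play for m iterations;
    # the row player minimizes x^T A y, the column player maximizes it.
    x, y = x0.copy(), y0.copy()
    x_avg, y_avg = np.zeros_like(x), np.zeros_like(y)
    for _ in range(m):
        gx = A @ y        # gradient for the minimizing row player
        gy = -A.T @ x     # the maximizer descends on the negated objective
        x = project_simplex(x - eta * gx)
        y = project_simplex(y - eta * gy)
        x_avg += x
        y_avg += y
    return x_avg / m, y_avg / m  # average iterates approximate equilibrium

def meta_solve(tasks, eta, m):
    # Warm-start each task from the average of previously computed
    # strategies -- one simple meta-initialization for illustration only.
    nr, nc = tasks[0].shape
    x0, y0 = np.full(nr, 1.0 / nr), np.full(nc, 1.0 / nc)
    xs, ys = [], []
    for A in tasks:  # each task: payoff matrix of a related zero-sum game
        x, y = solve_game_ogd(A, x0, y0, eta, m)
        xs.append(x)
        ys.append(y)
        x0 = np.mean(xs, axis=0)  # meta-learned initialization
        y0 = np.mean(ys, axis=0)
    return xs, ys
```

When consecutive tasks have nearby equilibria (as with varying stack sizes), the warm start places each run close to its solution, which is the intuition behind the reported speedups.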
Open Source Code | No | The paper links https://github.com/Sandholm-Lab/LibratusEndgames for the "two public endgames that were released by the authors"; this is the experimental data, not open-source code for the paper's methodology. There is no explicit statement of, or link to, the authors' own code.
Open Datasets | Yes | "We use the two public endgames that were released by the authors, denoted Endgame A and Endgame B, each corresponding to a zero-sum extensive-form game. For each of these endgames, we produced T := 200 individual tasks by varying the size of the stacks of each player according to three different task sequencing setups: [details of setups]." (Footnote 3: obtained from https://github.com/Sandholm-Lab/LibratusEndgames.)
Dataset Splits | No | The paper describes generating 200 individual tasks per endgame by varying stack sizes and evaluating performance across these tasks, but it does not specify explicit train/validation/test splits with percentages or sample counts.
Hardware Specification | No | The paper does not report specific hardware details, such as GPU or CPU models or cloud instance types, used to run the experiments.
Software Dependencies | No | The paper mentions algorithms such as OGD, a projection method described by Gilpin et al. (2012) and Farina et al. (2022), and a sparsification algorithm from Farina and Sandholm (2022), but provides no version numbers for any software dependencies such as programming languages or libraries.
Experiment Setup | Yes | "For each game, players run m := 1000 iterations... We tried different learning rates for the players selected from the set {0.1, 0.01, 0.001}. Figure 1 illustrates our results for η := 0.01..."
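Sweeps over learning rates like η ∈ {0.1, 0.01, 0.001} are typically compared via an equilibrium-quality metric such as the saddle-point (duality) gap: how much each player could still gain by best-responding. A minimal sketch for matrix games follows; this helper is ours, not the paper's, and the paper's endgames are extensive-form, where the analogous quantity is computed over sequence-form strategies.

```python
import numpy as np

def saddle_point_gap(A, x, y):
    """Duality gap of the profile (x, y) in a zero-sum matrix game where
    the row player minimizes x^T A y and the column player maximizes it.
    The gap is nonnegative and zero exactly at a Nash equilibrium."""
    best_response_value_col = np.max(A.T @ x)  # maximizer's best response to x
    best_response_value_row = np.min(A @ y)    # minimizer's best response to y
    return float(best_response_value_col - best_response_value_row)

# Example: in rock-paper-scissors, the uniform profile is an equilibrium.
A = np.array([[0., -1., 1.],
              [1., 0., -1.],
              [-1., 1., 0.]])
uniform = np.full(3, 1.0 / 3.0)
print(saddle_point_gap(A, uniform, uniform))  # 0.0
```

Plotting this gap against iteration count for each η is one standard way to produce comparisons like the paper's Figure 1.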