Meta-Learning in Games
Authors: Keegan Harris, Ioannis Anagnostides, Gabriele Farina, Mikhail Khodak, Zhiwei Steven Wu, Tuomas Sandholm
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we evaluate our meta-learning algorithms on endgames faced by the poker agent Libratus against top human professionals. The experiments show that games with varying stack sizes can be solved significantly faster using our meta-learning techniques than by solving them separately, often by an order of magnitude." Also supported by the section heading "4 EXPERIMENTS". |
| Researcher Affiliation | Collaboration | Keegan Harris (Carnegie Mellon University); Ioannis Anagnostides (Carnegie Mellon University); Gabriele Farina (FAIR, Meta AI); Mikhail Khodak (Carnegie Mellon University); Zhiwei Steven Wu (Carnegie Mellon University); Tuomas Sandholm (Carnegie Mellon University; Strategy Robot, Inc.; Optimized Markets, Inc.; Strategic Machine, Inc.) |
| Pseudocode | Yes | see Algorithm 1 (in Appendix B) for pseudocode of the meta-version of OGD we consider. |
| Open Source Code | No | The paper links to 'https://github.com/Sandholm-Lab/Libratus Endgames' as the source of 'two public endgames that were released by the authors'. This is the data used in the experiments, not open-source code for the methodology described in the paper; no statement or link releasing the authors' own code is provided. |
| Open Datasets | Yes | We use the two public endgames that were released by the authors (footnote: "Obtained from https://github.com/Sandholm-Lab/Libratus Endgames."), denoted Endgame A and Endgame B, each corresponding to a zero-sum extensive-form game. For each of these endgames, we produced T := 200 individual tasks by varying the size of the stacks of each player according to three different task sequencing setups: [details of setups]. |
| Dataset Splits | No | The paper describes generating 200 individual tasks by varying stack sizes for two endgames and evaluating performance on these tasks, but it does not specify explicit train/validation/test dataset splits with percentages or sample counts. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or cloud computing instance types, used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms like OGD and a projection method described by Gilpin et al. (2012) and Farina et al. (2022), and a sparsification algorithm from Farina and Sandholm (2022), but does not provide specific version numbers for any software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | For each game, players run m := 1000 iterations... We tried different learning rates for the players selected from the set {0.1, 0.01, 0.001}. Figure 1 illustrates our results for η := 0.01... |
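The setup row above quotes online gradient descent (OGD) run for m := 1000 iterations with learning rate η := 0.01 in zero-sum games. As a minimal illustration of that procedure (not the authors' implementation: the 2x2 matrix game and simplex projection below are hypothetical stand-ins for the extensive-form endgames and treeplex projections used in the paper), here is a self-play OGD sketch in Python:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def self_play_ogd(A, m=1000, eta=0.01):
    """Both players of the zero-sum game with payoff matrix A run
    projected OGD; returns the average (time-averaged) strategies,
    which converge toward a Nash equilibrium."""
    x = project_simplex(np.array([0.7, 0.3]))  # row player (minimizer)
    y = project_simplex(np.array([0.3, 0.7]))  # column player (maximizer)
    xs, ys = [], []
    for _ in range(m):
        # simultaneous gradient step, then project back onto the simplex
        x_new = project_simplex(x - eta * (A @ y))
        y_new = project_simplex(y + eta * (A.T @ x))
        x, y = x_new, y_new
        xs.append(x)
        ys.append(y)
    return np.mean(xs, axis=0), np.mean(ys, axis=0)

# Toy symmetric game whose unique equilibrium is the uniform strategy.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
x_avg, y_avg = self_play_ogd(A, m=1000, eta=0.01)
```

With η := 0.01 and m := 1000 (the values reported in the table), the average strategies land close to the uniform equilibrium (0.5, 0.5) of this toy game; the paper's meta-learning contribution amounts to warm-starting such dynamics across the T := 200 related tasks rather than re-solving each from scratch.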