For Learning in Symmetric Teams, Local Optima are Global Nash Equilibria

Authors: Scott Emmons, Caspar Oesterheld, Andrew Critch, Vincent Conitzer, Stuart Russell

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To answer this question, we present an empirical analysis of learning symmetric strategy profiles in the GAMUT suite of game generators (Nudelman et al., 2004). We are interested both in how centralized optimization algorithms (such as gradient methods) search for symmetric strategies and in how decentralized populations of agents evolve symmetric strategies.
Researcher Affiliation | Academia | 1 University of California, Berkeley; 2 Carnegie Mellon University; 3 Duke University, University of Oxford.
Pseudocode | No | The paper describes algorithms like SLSQP and the replicator dynamic but does not provide structured pseudocode or algorithm blocks for them.
Open Source Code | Yes | All of our code is available at https://github.com/scottemmons/coordination under the MIT License.
Open Datasets | Yes | We ran experiments in all three classes of symmetric GAMUT games: Random Game, Coordination Game, and Collaboration Game (Nudelman et al., 2004).
Dataset Splits | No | The paper does not provide explicit training/validation/test dataset splits in the conventional machine learning sense, as it focuses on simulations with game generators rather than fixed datasets.
Hardware Specification | Yes | To test a large number of random seeds, we ran our experiments for a few days on an Amazon Web Services c5.24xlarge instance.
Software Dependencies | No | Our code uses the following Python libraries: Matplotlib (Hunter, 2007), NumPy (Harris et al., 2020), pandas (Reback et al., 2021; Wes McKinney, 2010), SciPy (Virtanen et al., 2020), and SymPy (Meurer et al., 2017). Specific version numbers for these libraries are not explicitly provided in the text.
Experiment Setup | Yes | For each game class, we sweep the parameters of the game from 2 to 5 players and 2 to 5 actions, i.e., with (|N|, |A_i|) ∈ {2, 3, 4, 5} × {2, 3, 4, 5}. We sample 100 games at each parameter setting and then attempt to calculate the global symmetric optimum using (i) 10 runs of SLSQP and (ii) 10 runs of the replicator dynamic (each with a different initialization drawn uniformly at random over the simplex), resulting in 10 + 10 = 20 solution attempts per game. ... Experimentally, we found the fastest convergence with a step size of 1, and we found that 100 iterations sufficed for convergence; see Figure 3. For good measure, we ran 10,000 iterations of the replicator dynamic in all of our experiments.
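The experiment setup above describes two ways of searching for the global symmetric optimum: SLSQP and the replicator dynamic, each restarted from random points on the simplex. The sketch below is a minimal illustration of that procedure, not the authors' released code. The common-payoff tensor U, the positive payoff shift before the multiplicative replicator update, the step-size handling, and the toy symmetrized random game standing in for a sampled GAMUT game are all assumptions made for illustration.

```python
# Sketch: maximize the common payoff of a symmetric mixed strategy with
# (i) SLSQP and (ii) the discrete-time replicator dynamic, from random
# simplex initializations. U[a1, ..., an] is assumed to hold the team's
# common payoff for the joint action (a1, ..., an).
from itertools import permutations
from math import factorial

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)


def fitness(U, x):
    """Expected payoff of each pure action when every other player plays x."""
    f = U
    for _ in range(U.ndim - 1):  # contract the payoff tensor over the other players
        f = f @ x
    return f  # shape: (num_actions,)


def welfare(U, x):
    """Common payoff when all players use the symmetric mixed strategy x."""
    return x @ fitness(U, x)


def replicator_run(U, iters=10_000):
    """One replicator-dynamic run from a uniformly random simplex point."""
    U = U - U.min() + 1e-3  # shift payoffs positive so the multiplicative update is well defined
    x = rng.dirichlet(np.ones(U.shape[0]))
    for _ in range(iters):
        f = fitness(U, x)
        x = x * f / (x @ f)
    return x


def slsqp_run(U):
    """One SLSQP run from a uniformly random simplex point."""
    k = U.shape[0]
    res = minimize(
        lambda x: -welfare(U, x),  # maximize the team payoff
        rng.dirichlet(np.ones(k)),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * k,
        constraints=[{"type": "eq", "fun": lambda x: x.sum() - 1.0}],
    )
    return res.x


if __name__ == "__main__":
    # Toy stand-in for one sampled symmetric common-payoff game: 3 players, 4 actions.
    n_players, n_actions = 3, 4
    U = rng.uniform(size=(n_actions,) * n_players)
    U = sum(np.transpose(U, p) for p in permutations(range(n_players))) / factorial(n_players)

    # 10 + 10 = 20 solution attempts per game, as in the experiment setup above.
    attempts = [slsqp_run(U) for _ in range(10)] + [replicator_run(U) for _ in range(10)]
    best = max(attempts, key=lambda x: welfare(U, x))
    print("best symmetric strategy:", np.round(best, 3), "welfare:", round(welfare(U, best), 4))
```

In this sketch the best of the 20 attempts is taken as the estimate of the global symmetric optimum; the actual experiments use GAMUT's Random Game, Coordination Game, and Collaboration Game generators rather than the symmetrized random tensor shown here.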