Multiagent Evaluation under Incomplete Information

Authors: Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Pérolat, Michal Valko, Georgios Piliouras, Rémi Munos

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This paper investigates multiagent evaluation in the incomplete information regime, involving general-sum many-player games with noisy outcomes. We propose adaptive algorithms for accurate ranking, provide correctness and sample complexity guarantees, then introduce a means of connecting uncertainties in noisy match outcomes to uncertainties in rankings. We evaluate the performance of these approaches in several domains, including Bernoulli games, a soccer meta-game, and Kuhn poker.
Researcher Affiliation | Collaboration | Mark Rowland¹, markrowland@google.com; Shayegan Omidshafiei², somidshafiei@google.com; Karl Tuyls², karltuyls@google.com; Julien Pérolat¹, perolat@google.com; Michal Valko², valkom@deepmind.com; Georgios Piliouras³, georgios@sutd.edu.sg; Rémi Munos², munos@google.com. ¹DeepMind London, ²DeepMind Paris, ³Singapore University of Technology and Design
Pseudocode | Yes | Algorithm 1 ResponseGraphUCB(δ, S, C(δ))
Open Source Code | No | No explicit statement about providing open-source code or a link to a code repository for the described methodology was found.
Open Datasets | Yes | Second, we analyze a Soccer meta-game with the payoffs in Liu et al. [33, Figure 2]... Finally, we consider a Kuhn poker meta-game with asymmetric payoffs and 3 players with access to 3 agents each, similar to the domain analyzed in [36].
Dataset Splits | No | No explicit train/validation/test dataset splits (percentages, absolute counts, or references to predefined splits with specific details) are provided. The paper discusses simulating noisy outcomes and sampling interactions.
Hardware Specification | No | No specific hardware details (such as CPU/GPU models, memory, or detailed cloud/cluster configurations) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions 'MuJoCo simulation environment [46]' but does not provide a specific version number. No other specific software components with version numbers are listed.
Experiment Setup | Yes | In all domains, noisy outcomes are simulated by drawing the winning player i.i.d. from a Bernoulli(M^k(s)) distribution over payoff tables M. We build intuition by evaluating ResponseGraphUCB(δ : 0.1, S : UE, C : UCB), i.e., with a 90% confidence level, on a two-player game with payoffs shown in Fig. 4.1a. Due to the much larger strategy spaces of these games, we cap the number of samples available at 1e5.
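
To make the Experiment Setup entry more concrete, here is a minimal Python sketch of the general idea it describes: outcomes are drawn i.i.d. from Bernoulli distributions, and two strategy profiles are compared until their confidence intervals separate or a sample cap is hit. It is not the paper's Algorithm 1. The function names `radius` and `resolve_edge`, the alternating sampling order, and the Hoeffding interval with a union bound over sample counts are all assumptions made for illustration; the paper's own confidence bound C(δ) and sampling scheme S may differ.

```python
import math
import random


def radius(n, delta):
    """Hypothetical confidence half-width after n samples per profile.

    Hoeffding bound with a union bound over both profiles and all n,
    so both intervals hold at every step with probability >= 1 - delta.
    This is a stand-in, not the paper's C(delta).
    """
    return math.sqrt(math.log(8.0 * n * n / delta) / (2.0 * n))


def resolve_edge(p_sigma, p_tau, delta=0.1, max_samples=100_000, seed=0):
    """Compare two strategy profiles from simulated noisy outcomes.

    p_sigma and p_tau are the true win probabilities used to simulate
    Bernoulli outcomes. Returns (winner, total_samples), where winner is
    "sigma", "tau", or None if the sample cap is reached first.
    """
    rng = random.Random(seed)
    probs = {"sigma": p_sigma, "tau": p_tau}
    wins = {"sigma": 0, "tau": 0}
    n = 0  # samples drawn so far for each profile
    while 2 * (n + 1) <= max_samples:
        n += 1
        # Draw one Bernoulli outcome per profile (alternating sampling).
        for name in ("sigma", "tau"):
            wins[name] += rng.random() < probs[name]
        r = radius(n, delta)
        mean_sigma = wins["sigma"] / n
        mean_tau = wins["tau"] / n
        # Stop as soon as the two confidence intervals are disjoint.
        if mean_sigma - r > mean_tau + r:
            return "sigma", 2 * n
        if mean_tau - r > mean_sigma + r:
            return "tau", 2 * n
    return None, 2 * n


if __name__ == "__main__":
    print(resolve_edge(0.9, 0.1))                   # well-separated profiles
    print(resolve_edge(0.5, 0.5, max_samples=200))  # indistinguishable within the cap
```

In this sketch, `delta=0.1` plays the role of the 90% confidence level and `max_samples` plays the role of the 1e5 cap quoted above. Profiles with similar win probabilities need many more samples to separate, which is why a cap is useful in larger games.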