reproducibilityindex.ai

Multiagent Evaluation under Incomplete Information

Authors: Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Perolat, Michal Valko, Georgios Piliouras, Remi Munos

NeurIPS 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	This paper investigates multiagent evaluation in the incomplete information regime, involving general-sum many-player games with noisy outcomes. We propose adaptive algorithms for accurate ranking, provide correctness and sample complexity guarantees, then introduce a means of connecting uncertainties in noisy match outcomes to uncertainties in rankings. We evaluate the performance of these approaches in several domains, including Bernoulli games, a soccer meta-game, and Kuhn poker.
Researcher Affiliation	Collaboration	Mark Rowland¹ (markrowland@google.com), Shayegan Omidshafiei² (somidshafiei@google.com), Karl Tuyls² (karltuyls@google.com), Julien Pérolat¹ (perolat@google.com), Michal Valko² (valkom@deepmind.com), Georgios Piliouras³ (georgios@sutd.edu.sg), Rémi Munos² (munos@google.com). ¹DeepMind London, ²DeepMind Paris, ³Singapore University of Technology and Design
Pseudocode	Yes	Algorithm 1 ResponseGraphUCB(δ, S, C(δ))
Open Source Code	No	No explicit statement about providing open-source code or a link to a code repository for the described methodology was found.
Open Datasets	Yes	Second, we analyze a Soccer meta-game with the payoffs in Liu et al. [33, Figure 2]... Finally, we consider a Kuhn poker meta-game with asymmetric payoffs and 3 players with access to 3 agents each, similar to the domain analyzed in [36]
Dataset Splits	No	No explicit train/validation/test dataset splits (percentages, absolute counts, or references to predefined splits with specific details) are provided. The paper discusses simulating noisy outcomes and sampling interactions.
Hardware Specification	No	No specific hardware details (such as CPU/GPU models, memory, or detailed cloud/cluster configurations) used for running the experiments are mentioned in the paper.
Software Dependencies	No	The paper mentions 'MuJoCo simulation environment [46]' but does not provide a specific version number. No other specific software components with version numbers are listed.
Experiment Setup	Yes	In all domains, noisy outcomes are simulated by drawing the winning player i.i.d. from a Bernoulli(M^k(s)) distribution over payoff tables M. We build intuition by evaluating ResponseGraphUCB(δ: 0.1, S: UE, C: UCB), i.e., with a 90% confidence level, on a two-player game with payoffs shown in Fig. 4.1a. Due to the much larger strategy spaces of these games, we cap the number of samples available at 1e5.
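
The experiment setup quoted above (Bernoulli-distributed match outcomes, a confidence parameter δ = 0.1, and a cap on the number of samples) can be illustrated with a small sketch. This is not the authors' implementation of ResponseGraphUCB. It assumes a simple round-robin sampling order and a per-entry Hoeffding bound as stand-ins, and the paper's actual sampling scheme S and confidence construction C(δ) may differ. The function names (`simulate_outcome`, `hoeffding_radius`, `estimate_payoffs`) and the toy payoff table are hypothetical.

```python
import math
import random


def simulate_outcome(win_prob, rng):
    """Return 1 with probability win_prob, else 0 (one noisy match outcome)."""
    return 1 if rng.random() < win_prob else 0


def hoeffding_radius(n, delta):
    """Half-width of a two-sided Hoeffding interval for n samples in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def estimate_payoffs(true_payoffs, delta=0.1, max_samples=100_000, seed=0):
    """Sample every strategy profile in round-robin order until the budget is spent.

    true_payoffs maps a strategy profile to a win probability (hypothetical input).
    Returns {profile: (empirical_mean, lower_bound, upper_bound, n_samples)}.
    Assumption: delta is applied per entry, with no union bound over profiles.
    """
    rng = random.Random(seed)
    profiles = list(true_payoffs)
    wins = {s: 0 for s in profiles}
    counts = {s: 0 for s in profiles}

    # Spend the sample budget evenly across all profiles.
    for t in range(max_samples):
        s = profiles[t % len(profiles)]
        wins[s] += simulate_outcome(true_payoffs[s], rng)
        counts[s] += 1

    results = {}
    for s in profiles:
        n = counts[s]
        if n == 0:
            # No samples drawn for this profile: only the trivial interval holds.
            results[s] = (0.5, 0.0, 1.0, 0)
            continue
        mean = wins[s] / n
        radius = hoeffding_radius(n, delta)
        results[s] = (mean, max(0.0, mean - radius), min(1.0, mean + radius), n)
    return results


if __name__ == "__main__":
    # Toy two-player, two-strategy table: probability that player 1 wins.
    toy = {(0, 0): 0.5, (0, 1): 0.7, (1, 0): 0.3, (1, 1): 0.5}
    for profile, (mean, lo, hi, n) in estimate_payoffs(toy, max_samples=10_000).items():
        print(profile, f"mean={mean:.3f} interval=[{lo:.3f}, {hi:.3f}] n={n}")
```

With a fixed budget, the interval width shrinks as 1/sqrt(n) per profile, which is why a sample cap such as 1e5 limits how finely payoffs can be resolved in games with many strategy profiles.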