Modelling Behavioural Diversity for Learning in Open-Ended Games

Authors: Nicolas Perez-Nieves, Yaodong Yang, Oliver Slumbers, David H. Mguni, Ying Wen, Jun Wang

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We test on tens of games that show strong non-transitivity. Results suggest that our methods achieve at least the same, and in most games, lower exploitability than PSRO solvers by finding effective and diverse strategies." "Empirically, we evaluate our methods on tens of games that show strong non-transitivity, covering both normal-form games and open-ended games. Results confirm the superior performance of our methods, in terms of lower exploitability, against the state-of-the-art game solvers."
Researcher Affiliation | Collaboration | 1 Huawei U.K.; 2 Imperial College London (work done during internship at Huawei U.K.); 3 University College London.
Pseudocode | Yes | "We provide zero-order Oracle and RL-based Oracle as approximation solutions to Eq. (13), and list their pseudocode and time complexity in Appendix H." (A hedged sketch of a generic zero-order oracle appears after this table.)
Open Source Code | No | The paper provides no statement or link indicating that source code for the described method is publicly available.
Open Datasets | Yes | "We test our methods on the meta-games that are generated during the process of solving 28 real-world games (Czarnecki et al., 2020), including AlphaStar and AlphaGo." (See the exploitability sketch following the table.)
Dataset Splits | No | The paper evaluates on games and meta-games rather than on fixed datasets, so it specifies no traditional training/validation/test splits.
Hardware Specification | No | The paper does not describe the hardware (e.g., GPU model, CPU type, memory) used to run its experiments.
Software Dependencies | No | The paper refers to general classes of algorithms (e.g., "RL algorithms") but lists no specific software dependencies or version numbers (e.g., PyTorch 1.x, Python 3.x).
Experiment Setup | Yes | "We provide an exhaustive list of hyper-parameter and reward settings in Appendix G."
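
The Pseudocode row references a zero-order Oracle for Eq. (13), whose actual pseudocode lives in the paper's Appendix H and is not reproduced here. The following is a minimal sketch of a generic zero-order (finite-difference, evolution-strategies style) oracle, assuming only a black-box objective; the name zero_order_oracle and every hyper-parameter (sigma, lr, n_samples, n_iters) are illustrative assumptions rather than the authors' settings, and the paper's diversity term is folded into the abstract objective.

    # Illustrative sketch of a zero-order oracle; NOT the paper's Eq. (13)
    # implementation. `objective` is a black box that is assumed to already
    # bundle the best-response payoff and any diversity bonus.
    import numpy as np

    def zero_order_oracle(objective, theta, sigma=0.1, lr=0.05,
                          n_samples=32, n_iters=200, seed=0):
        """Ascend `objective` using antithetic Gaussian perturbations in
        place of analytic gradients (evolution-strategies estimator)."""
        rng = np.random.default_rng(seed)
        theta = theta.copy()
        for _ in range(n_iters):
            eps = rng.standard_normal((n_samples, theta.size))
            # Antithetic pairs f(theta + s*e) - f(theta - s*e) lower the
            # variance of the finite-difference gradient estimate.
            scores = np.array([objective(theta + sigma * e)
                               - objective(theta - sigma * e) for e in eps])
            grad = (scores[:, None] * eps).mean(axis=0) / (2.0 * sigma)
            theta += lr * grad
        return theta

    # Toy usage: approximate a best response to a fixed opponent mixture in
    # rock-paper-scissors, with a softmax-parameterised mixed strategy.
    A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
    opponent = np.array([0.5, 0.3, 0.2])
    softmax = lambda z: np.exp(z - z.max()) / np.exp(z - z.max()).sum()
    objective = lambda logits: softmax(logits) @ A @ opponent
    best_logits = zero_order_oracle(objective, np.zeros(3))

Over the iterations, softmax(best_logits) should shift most of its mass toward "paper", the best response to an opponent who over-plays rock.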
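
Both quoted result statements use exploitability as the evaluation metric. As a hedged point of reference, not code from the paper, the sketch below computes the exploitability of a mixed strategy on a symmetric zero-sum payoff matrix, the form in which meta-games such as those of Czarnecki et al. (2020) are commonly handled; the matrix A and the strategies are illustrative.

    # Hedged sketch: exploitability of a mixed strategy `p` on a symmetric
    # zero-sum payoff matrix `A` (row player's payoffs). This follows the
    # standard definition and is not taken from the paper.
    import numpy as np

    def exploitability(A, p):
        # Gain the best pure response achieves over what `p` earns against
        # itself; it is 0 exactly at a symmetric Nash equilibrium.
        return float(np.max(A @ p) - p @ A @ p)

    A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])  # rock-paper-scissors
    print(exploitability(A, np.ones(3) / 3))          # ~0.0: uniform play is Nash
    print(exploitability(A, np.array([1., 0., 0.])))  # 1.0: pure rock is fully exploitable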