Modelling Behavioural Diversity for Learning in Open-Ended Games
Authors: Nicolas Perez-Nieves, Yaodong Yang, Oliver Slumbers, David H Mguni, Ying Wen, Jun Wang
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test on tens of games that show strong non-transitivity. Results suggest that our methods achieve at least the same, and in most games, lower exploitability than PSRO solvers by finding effective and diverse strategies. Empirically, we evaluate our methods on tens of games that show strong non-transitivity, covering both normal-form games and open-ended games. Results confirm the superior performance of our methods, in terms of lower exploitability, against the state-of-the-art game solvers. |
| Researcher Affiliation | Collaboration | 1Huawei U.K. 2Imperial College London, work done during internship at Huawei U.K. 3University College London. |
| Pseudocode | Yes | We provide zero-order Oracle and RL-based Oracle as approximation solutions to Eq. (13), and list their pseudocode and time complexity in Appendix H. |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We test our methods on the meta-games that are generated during the process of solving 28 real-world games (Czarnecki et al., 2020), including AlphaStar and AlphaGo. |
| Dataset Splits | No | The paper describes testing on games and meta-games, but it does not specify explicit training/validation/test dataset splits in the traditional sense. The evaluation is on game performance, not on pre-split fixed datasets. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU models, CPU types, memory) for running its experiments. |
| Software Dependencies | No | The paper mentions general types of algorithms (e.g., 'RL algorithms') but does not provide specific software dependencies with version numbers (e.g., 'PyTorch 1.x' or 'Python 3.x'). |
| Experiment Setup | Yes | We provide an exhaustive list of hyper-parameter and reward settings in Appendix G. |
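The paper's headline metric is exploitability, the distance of a strategy profile from a Nash equilibrium. As an illustrative aside (not code from the paper), the following sketch shows how exploitability is conventionally computed for a two-player zero-sum normal-form game; the function name and the rock-paper-scissors example are ours, not the authors'.

```python
import numpy as np

def exploitability(A, p, q):
    """Exploitability of profile (p, q) in a two-player zero-sum
    normal-form game, where A is the row player's payoff matrix.
    It is the total gain available from unilateral best responses,
    and equals zero iff (p, q) is a Nash equilibrium."""
    br_row = np.max(A @ q)      # row player's best-response value vs q
    br_col = np.max(-(p @ A))   # column player's best-response value vs p
    return br_row + br_col

# Rock-paper-scissors: the uniform mixture is the Nash equilibrium.
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)
uniform = np.ones(3) / 3
exploitability(A, uniform, uniform)   # → 0.0
```

A non-equilibrium profile (e.g. both players always playing rock) yields a strictly positive value, which is the quantity the paper reports its solvers driving down.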