Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Modelling Behavioural Diversity for Learning in Open-Ended Games
Authors: Nicolas Perez-Nieves, Yaodong Yang, Oliver Slumbers, David H Mguni, Ying Wen, Jun Wang
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we evaluate our methods on tens of games that show strong non-transitivity, covering both normal-form games and open-ended games. Results suggest that our methods achieve at least the same, and in most games, lower exploitability than PSRO solvers by finding effective and diverse strategies, confirming the superior performance of our methods against the state-of-the-art game solvers. |
| Researcher Affiliation | Collaboration | ¹Huawei U.K.; ²Imperial College London (work done during internship at Huawei U.K.); ³University College London. |
| Pseudocode | Yes | We provide zero-order Oracle and RL-based Oracle as approximation solutions to Eq. (13), and list their pseudocode and time complexity in Appendix H. |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We test our methods on the meta-games that are generated during the process of solving 28 real-world games (Czarnecki et al., 2020), including AlphaStar and AlphaGo. |
| Dataset Splits | No | The paper describes testing on games and meta-games, but it does not specify explicit training/validation/test dataset splits in the traditional sense. The evaluation is on game performance, not on pre-split fixed datasets. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU models, CPU types, memory) for running its experiments. |
| Software Dependencies | No | The paper mentions general types of algorithms (e.g., 'RL algorithms') but does not provide specific software dependencies with version numbers (e.g., 'PyTorch 1.x' or 'Python 3.x'). |
| Experiment Setup | Yes | We provide an exhaustive list of hyper-parameter and reward settings in Appendix G. |