Neural Auto-Curricula in Two-Player Zero-Sum Games

Authors: Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen McAleer, Ying Wen, Jun Wang, Yaodong Yang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical results show that NAC can discover meaningful solution concepts alike NE, and based on that build effective auto-curricula in training agent populations. In multiple different environments, the discovered auto-curriculum achieves the same performance or better than that of PSRO methods [25, 3]. We additionally evaluate the ability of our discovered meta-solvers to generalise to unseen games of a similar type (e.g., training on Kuhn Poker and testing on Leduc Poker), and show that the auto-curricula found on a simple environment is able to generalise to a more difficult one. Also, from Section 4 (Experiments): We validate the effectiveness of NAC on five types of zero-sum environments.
Researcher Affiliation | Academia | Xidong Feng (1), Oliver Slumbers (1), Ziyu Wan (2), Bo Liu (3), Stephen McAleer (4), Ying Wen (2), Jun Wang (1), Yaodong Yang (5). Affiliations: (1) University College London; (2) Shanghai Jiao Tong University; (3) Institute of Automation, CAS; (4) University of California, Irvine; (5) Institute for AI, Peking University.
Pseudocode | Yes | Algorithm 1: Neural Auto-Curricula (NAC)
Open Source Code | Yes | Code released at https://github.com/waterhorse1/NAC
Open Datasets | Yes | We validate the effectiveness of NAC on five types of zero-sum environments with different levels of complexity. They are Games of Skill (GoS) [8], differentiable Lotto [3], non-transitive mixture game (2D-RPS) [37], iterated matching pennies (IMP) [15, 21] and Kuhn Poker [23].
Dataset Splits | No | The paper discusses training and testing on different games (e.g., Kuhn Poker for training and Leduc Poker for testing) but does not provide explicit train/validation/test dataset splits, percentages, or specific counts. The games themselves define the data.
Hardware Specification | No | The paper does not provide any specific details about the hardware used, such as GPU/CPU models, memory, or cloud computing instances.
Software Dependencies | No | Their implementations can be found in OpenSpiel [24]. No version numbers are provided for OpenSpiel or other software.
Experiment Setup | Yes | More details of all of the applied oracles and their hyper-parameters are in Appendix F, and details of the baseline implementations are in Appendix E.