A Unified Diversity Measure for Multiagent Reinforcement Learning

Authors: Zongkai Liu, Chao Yu, Yaodong Yang, Peng Sun, Zifan Wu, Yuan Li

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our algorithms on games that show strong non-transitivity, and empirical results show that our algorithms achieve better performances than strong PSRO baselines in terms of the exploitability and population effectivity.
Researcher Affiliation | Collaboration | Zongkai Liu, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China, liuzk@mail2.sysu.edu.cn; Chao Yu, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China, yuchao3@mail.sysu.edu.cn; Yaodong Yang, Institute for AI, Peking University, Beijing Institute for General AI, Beijing, yaodong.yang@pku.edu.cn; Peng Sun, ByteDance, Shenzhen, China, pengsun000@gmail.com; Zifan Wu, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China, wuzf5@mail2.sysu.edu.cn
Pseudocode | No | The paper describes algorithms using mathematical equations and text, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See Appendix A.4
Open Datasets | No | The paper mentions games like 'Alpha Star game', 'Blotto', and 'Non-Transitive Mixture Model', but does not provide specific links, DOIs, repository names, or formal citations for the exact datasets used in its experiments, nor does it explicitly state their public availability with access details.
Dataset Splits | No | The paper states 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix A.4', but the provided text does not contain explicit details about train/validation/test dataset splits with percentages or counts.
Hardware Specification | No | The paper explicitly states '[N/A]' for the question 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?'.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | More experimental settings and results can be found in Appendix A.4. The function in UDM-PSRO is a concave function $f(x) = \frac{1}{1+\exp(-x)} - \frac{1}{2} \in \mathcal{F}$ (see Appendix A.5 for more discussions on the selection of f and K).
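As a rough illustration of the function named in the Experiment Setup row, the sketch below reads it as f(x) = 1/(1 + exp(-x)) - 1/2, that is, a logistic curve shifted to pass through the origin. This reading is an assumption, since the extracted text lost its operators. The restriction to nonnegative inputs, the evaluation grid, and the helper `second_difference` are likewise illustrative choices and do not come from the paper.

```python
import math


def f(x: float) -> float:
    """Shifted logistic function f(x) = 1 / (1 + exp(-x)) - 1/2 (assumed reading)."""
    return 1.0 / (1.0 + math.exp(-x)) - 0.5


def second_difference(g, x: float, h: float = 1e-3) -> float:
    """Central finite-difference estimate of the second derivative g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)


# Hypothetical grid of positive inputs, chosen only for illustration.
grid = [0.5 * k for k in range(1, 21)]  # 0.5, 1.0, ..., 10.0

value_at_zero = f(0.0)
is_increasing = all(f(a) < f(b) for a, b in zip(grid, grid[1:]))
is_concave = all(second_difference(f, x) < 0.0 for x in grid)

print(f"f(0) = {value_at_zero}")
print(f"increasing on grid: {is_increasing}")
print(f"negative second difference on grid: {is_concave}")
```

Under this reading, f vanishes at zero, increases toward 1/2, and is concave for positive inputs. For negative inputs the logistic curve is convex, so the concavity stated in the paper presumably refers to a nonnegative domain.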