A Unified Diversity Measure for Multiagent Reinforcement Learning
Authors: Zongkai Liu, Chao Yu, Yaodong Yang, Peng Sun, Zifan Wu, Yuan Li
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our algorithms on games that show strong non-transitivity, and empirical results show that our algorithms achieve better performances than strong PSRO baselines in terms of the exploitability and population effectivity. |
| Researcher Affiliation | Collaboration | Zongkai Liu, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China, liuzk@mail2.sysu.edu.cn; Chao Yu, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China, yuchao3@mail.sysu.edu.cn; Yaodong Yang, Institute for AI, Peking University and Beijing Institute for General AI, Beijing, yaodong.yang@pku.edu.cn; Peng Sun, ByteDance, Shenzhen, China, pengsun000@gmail.com; Zifan Wu, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China, wuzf5@mail2.sysu.edu.cn |
| Pseudocode | No | The paper describes algorithms using mathematical equations and text, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See Appendix A.4 |
| Open Datasets | No | The paper mentions games like 'Alpha Star game', 'Blotto', and 'Non-Transitive Mixture Model', but does not provide specific links, DOIs, repository names, or formal citations for the exact datasets used in their experiments, nor does it explicitly state their public availability with access details. |
| Dataset Splits | No | The paper states 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix A.4', but the provided text does not contain explicit details about train/validation/test dataset splits with percentages or counts. |
| Hardware Specification | No | The paper explicitly states '[N/A]' for the question 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?'. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | More experimental settings and results can be found in Appendix A.4. The function in UDM-PSRO is a concave function $f(x) = \frac{1}{1+\exp(-x)} - \frac{1}{2} \in \mathcal{F}$ (see Appendix A.5 for more discussions on the selection of $f$ and $K$). |
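
The function quoted in the Experiment Setup row can be evaluated directly. The sketch below reads it as the shifted sigmoid f(x) = 1/(1 + exp(-x)) - 1/2, which is an interpretation of the extracted text rather than something confirmed against the paper. The helper `is_concave_on_grid` and the grids are hypothetical names introduced only for illustration. Under this reading, f(0) = 0 and f is concave on nonnegative inputs only, so the check is run on x >= 0.

```python
import math


def f(x: float) -> float:
    """Shifted sigmoid 1 / (1 + exp(-x)) - 1/2 (assumed reading of the quoted formula)."""
    return 1.0 / (1.0 + math.exp(-x)) - 0.5


def is_concave_on_grid(func, xs, tol=1e-12):
    """Midpoint concavity check (hypothetical helper).

    Returns True if func((a + b) / 2) >= (func(a) + func(b)) / 2
    holds, up to tol, for every pair of grid points a, b.
    """
    return all(
        func((a + b) / 2.0) >= (func(a) + func(b)) / 2.0 - tol
        for a in xs
        for b in xs
    )


# Evenly spaced points in [0, 10] and in [-10, 0].
nonneg_grid = [0.25 * i for i in range(41)]
neg_grid = [-0.25 * i for i in range(41)]

if __name__ == "__main__":
    print("f(0) =", f(0.0))
    print("concave on [0, 10]:", is_concave_on_grid(f, nonneg_grid))
    print("concave on [-10, 0]:", is_concave_on_grid(f, neg_grid))
```

Because the sigmoid is convex for negative inputs, the same check fails on `neg_grid`. The concavity claim therefore only makes sense if f is applied to nonnegative quantities, which is an assumption here and should be checked against Appendix A.5.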