Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems

Authors: Jiayu Chen, Yuanxin Zhang, Yuanfan Xu, Huimin Ma, Huazhong Yang, Jiaming Song, Yu Wang, Yi Wu

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results show that VACL solves a collection of sparse-reward problems with a large number of agents. Particularly, using a single desktop machine, VACL achieves a 98% coverage rate with 100 agents in the simple-spread benchmark and reproduces the ramp-use behavior originally shown in OpenAI's hide-and-seek project. |
| Researcher Affiliation | Academia | 1 Tsinghua University, 2 Shanghai Qi Zhi Institute, 3 University of Science and Technology Beijing, 4 Stanford University |
| Pseudocode | Yes | Algorithm 1: The VACL Algorithm |
| Open Source Code | Yes | Our project website is at https://sites.google.com/view/vacl-neurips-2021. (This project website links to a GitHub repository: https://github.com/PKU-RL/VACL) |
| Open Datasets | Yes | We consider four tasks over two environments, Simple-Spread and Push-Ball in the multi-agent particle-world environment (MPE) [19], and Ramp-Use and Lock-and-Return in the MuJoCo-based hide-and-seek environment (HnS) [2]. (See the environment sketch below the table.) |
| Dataset Splits | No | The paper references datasets/environments (MPE, HnS) but does not provide specific train/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) needed for reproduction. |
| Hardware Specification | Yes | Every experiment is repeated over 3 seeds and performed on a desktop machine with one 64-core CPU and one 2080 Ti GPU, which is used for forward action computation and training updates. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) needed to replicate the experiment. |
| Experiment Setup | Yes | Every experiment is repeated over 3 seeds and performed on a desktop machine with one 64-core CPU and one 2080 Ti GPU, which is used for forward action computation and training updates. For PC-Unif and VACL, we start with n0 = 4 in Simple-Spread and n0 = 2 in Push-Ball and then switch to the desired agent number. (See the schedule sketch below the table.) |
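
For concreteness, here is a minimal rollout in the Simple-Spread task named in the Open Datasets row. This is a hedged sketch: it uses the PettingZoo port of MPE, so the module name `simple_spread_v3` and its keyword arguments are PettingZoo conventions rather than anything from the paper, which cites the original multi-agent particle-world implementation [19]; the random policy is a stand-in for the trained agents.

```python
# Minimal sketch: random-action rollout in MPE Simple-Spread.
# Assumes the PettingZoo port of MPE; `simple_spread_v3` and its
# parameters are PettingZoo conventions, not taken from the paper,
# which uses the original multi-agent particle-world environment [19].
from pettingzoo.mpe import simple_spread_v3

# N sets the number of agents (and landmarks); VACL's curriculum
# eventually scales this up to 100 agents in the paper's experiments.
env = simple_spread_v3.parallel_env(N=4, max_cycles=25, continuous_actions=False)
observations, infos = env.reset(seed=0)

while env.agents:
    # Random policy as a placeholder; VACL would query its trained policy here.
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)

env.close()
```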
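
The Experiment Setup row describes a two-stage agent-count schedule: training starts with a small population (n0 = 4 in Simple-Spread, n0 = 2 in Push-Ball) and later switches to the desired agent number. A minimal sketch of such a schedule follows; the success-rate trigger and every name in it are hypothetical illustrations — the paper's actual curriculum mechanism is Algorithm 1 (VACL), not this rule.

```python
# Hypothetical sketch of the two-stage agent-count schedule quoted in
# the Experiment Setup row. The success-rate threshold is an assumed
# switching criterion for illustration only; VACL's real curriculum
# update is specified by Algorithm 1 in the paper.
STARTING_AGENTS = {"simple_spread": 4, "push_ball": 2}  # n0 values from the paper

def agent_count(task: str, target_agents: int, success_rate: float,
                threshold: float = 0.9) -> int:
    """Number of agents to train with at the current curriculum stage."""
    n0 = STARTING_AGENTS[task]
    # Stay at the easy stage until the policy is reliable, then switch.
    return n0 if success_rate < threshold else target_agents

# Example: Simple-Spread with the paper's 100-agent target.
assert agent_count("simple_spread", 100, success_rate=0.30) == 4    # early training
assert agent_count("simple_spread", 100, success_rate=0.95) == 100  # after the switch
```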