Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems
Authors: Jiayu Chen, Yuanxin Zhang, Yuanfan Xu, Huimin Ma, Huazhong Yang, Jiaming Song, Yu Wang, Yi Wu
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results show that VACL solves a collection of sparse-reward problems with a large number of agents. Particularly, using a single desktop machine, VACL achieves 98% coverage rate with 100 agents in the simple-spread benchmark and reproduces the ramp-use behavior originally shown in OpenAI's hide-and-seek project. |
| Researcher Affiliation | Academia | 1 Tsinghua University, 2 Shanghai Qi Zhi Institute, 3 University of Science and Technology Beijing, 4 Stanford University |
| Pseudocode | Yes | Algorithm 1: The VACL Algorithm |
| Open Source Code | Yes | Our project website is at https://sites.google.com/view/vacl-neurips-2021. (This project website links to a GitHub repository: https://github.com/PKU-RL/VACL) |
| Open Datasets | Yes | We consider four tasks over two environments, Simple-Spread and Push-Ball in the multi-agent particle-world environment (MPE) [19], and Ramp-Use and Lock-and-Return in the MuJoCo-based hide-and-seek environment (HnS) [2]. |
| Dataset Splits | No | The paper references datasets/environments (MPE, HnS) but does not provide specific train/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) needed for reproduction. |
| Hardware Specification | Yes | Every experiment is repeated over 3 seeds and performed on a desktop machine with one 64-core CPU and one 2080-Ti GPU, which is used for forward action computation and training updates. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) needed to replicate the experiment. |
| Experiment Setup | Yes | Every experiment is repeated over 3 seeds and performed on a desktop machine with one 64-core CPU and one 2080-Ti GPU, which is used for forward action computation and training updates. For PC-Unif and VACL, we start with n0 = 4 in Simple-Spread and n0 = 2 in Push-Ball and then switch to the desired agent number. |