Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
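The validation the notice refers to amounts to comparing LLM-assigned labels against manual reference labels per variable. A minimal sketch of that comparison is below; all names and data are illustrative, not taken from [1].

```python
# Hypothetical sketch: measuring agreement between LLM-assigned
# reproducibility labels and a manually labeled reference set.
# The labels below are toy data, not results from [1].

def accuracy(llm_labels, manual_labels):
    """Fraction of items where the LLM label matches the manual label."""
    assert len(llm_labels) == len(manual_labels)
    hits = sum(l == m for l, m in zip(llm_labels, manual_labels))
    return hits / len(llm_labels)

# Toy example: labels for one variable (e.g. "Open Source Code") on five papers.
llm = ["Yes", "No", "Yes", "Yes", "No"]
manual = ["Yes", "No", "No", "Yes", "No"]
print(accuracy(llm, manual))  # 0.8
```

Per-variable accuracies computed this way are the kind of metric the notice says is reported in full in [1].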
PORTAL: Automatic Curricula Generation for Multiagent Reinforcement Learning
Authors: Jizhou Wu, Jianye Hao, Tianpei Yang, Xiaotian Hao, Yan Zheng, Weixun Wang, Matthew E. Taylor
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experimental results show that PORTAL can train agents to master extremely hard cooperative tasks, which cannot be achieved with previous state-of-the-art MARL algorithms. In this section, we study the following research questions (RQs) via comprehensive experiments." |
| Researcher Affiliation | Collaboration | ¹College of Intelligence and Computing, Tianjin University; ²University of Alberta and Alberta Machine Intelligence Institute; ³NetEase Fuxi AI Lab |
| Pseudocode | Yes | Algorithm 1: PORTAL |
| Open Source Code | Yes | Code and appendix: https://github.com/TJU-DRLLAB/transfer-and-multi-task-reinforcement-learning |
| Open Datasets | No | The paper uses the "StarCraft Multi-Agent Challenge (SMAC) (Samvelyan et al. 2019)" as its benchmark but provides no concrete access information for it, such as a link, DOI, or specific repository. While SMAC is a well-known benchmark, the paper does not explicitly state its public availability with a direct source. |
| Dataset Splits | No | The paper mentions "test win rate" but does not explicitly specify training, validation, or test splits by percentages, counts, or references to predefined splits within the SMAC environment. It describes task series and scenarios but not how the data within those tasks is split for training/validation/testing. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper mentions using "HPN-VDN (Hao et al. 2022)" as a backbone model and refers to other MARL algorithms, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, specific libraries, or their versions). |
| Experiment Setup | No | The paper states "detailed model structure and hyperparameter settings are in Appendix B" but does not include these details in the main text. The main text describes the experimental setup in terms of environments and baselines, but lacks concrete hyperparameter values or training configurations. |