Backpropagation Through Agents
Authors: Zhiyuan Li, Wenshuai Zhao, Lijun Wu, Joni Pajarinen
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on matrix games, StarCraft II v2, Multi-Agent MuJoCo, and Google Research Football demonstrate the effectiveness of the proposed method. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, University of Electronic Science and Technology of China; (2) Department of Electrical Engineering and Automation, Aalto University |
| Pseudocode | Yes | We provide the pseudo-code for BPPO in Algorithm 1. |
| Open Source Code | No | No explicit statement or link is provided for the open-source code of the methodology described in this paper. |
| Open Datasets | Yes | Extensive experiments on matrix games, StarCraft II v2, Multi-Agent MuJoCo, and Google Research Football demonstrate the effectiveness of the proposed method. |
| Dataset Splits | No | The paper conducts experiments in multi-agent reinforcement learning environments (matrix games, StarCraft II, MuJoCo, Google Research Football) where data is generated through interaction, not drawn from a pre-defined static dataset with explicit train/validation/test splits. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions various algorithms and environments such as PPO, SAC, StarCraft II, MuJoCo, and Google Research Football. However, it does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. |
| Experiment Setup | No | The paper states 'More experimental details and results on these tasks are included in Appendix.' However, the main body of the paper does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings). |