Measuring Mutual Policy Divergence for Multi-Agent Sequential Exploration
Authors: Haowen Dou, Lujuan Dang, Zhirong Luan, Badong Chen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that the proposed method outperforms state-of-the-art sequential updating approaches in two challenging multi-agent tasks with various heterogeneous scenarios. |
| Researcher Affiliation | Academia | ¹National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, ²National Engineering Research Center for Visual Information and Applications, ³Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, ⁴School of Electrical Engineering, Xi'an University of Technology |
| Pseudocode | Yes | Algorithm 1 Multi-Agent Divergence Policy Optimization |
| Open Source Code | Yes | Source code is available at https://github.com/hwdou6677/MADPO. |
| Open Datasets | Yes | We evaluate the proposed MADPO on two challenging multi-agent heterogeneous environments, Multi-Agent Mujoco (MA-Mujoco) [de Witt et al., 2020] and Bi-Dex Hands [Chen et al., 2022]. |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits, but rather refers to running experiments on different scenarios and tasks within the environments. |
| Hardware Specification | Yes | The experiments were conducted on a PC with an NVIDIA RTX 3090 GPU, an Intel Xeon 64-core CPU, and 64 GB of RAM. |
| Software Dependencies | No | The paper mentions hyperparameters and implies the use of frameworks like PyTorch (common for this type of research) but does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | For MA-Mujoco, the common hyperparameters are listed in Tab. 1, and the hyperparameters that differ in each scenario are listed in Tab. 2. For Bi-Dex Hands, the common hyperparameters are listed in Tab. 3, and the hyperparameters that differ in each scenario are listed in Tab. 4. |