Situation-Dependent Causal Influence-Based Cooperative Multi-Agent Reinforcement Learning
Authors: Xiao Du, Yutong Ye, Pengyu Zhang, Yaning Yang, Mingsong Chen, Ting Wang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on various MARL benchmarks demonstrate the superiority of our method compared to state-of-the-art approaches. |
| Researcher Affiliation | Academia | Xiao Du, Yutong Ye, Pengyu Zhang, Yaning Yang, Mingsong Chen, Ting Wang* Software Engineering Institute, East China Normal University {52265902007, 52205902007, pengyu.zhang, 52215902011}@stu.ecnu.edu.cn, {mschen, twang}@sei.ecnu.edu.cn |
| Pseudocode | Yes | Algorithm 1: Training algorithm |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We evaluate our proposed approach on three benchmark multi-agent tasks: Partial Observation Cooperative Predator Prey, Cooperative Navigation, and Cooperative Line Control. The benchmarks environment is implemented in a Multi-Agent Particle Environment (Lowe et al. 2017) |
| Dataset Splits | No | The paper mentions training and evaluating on benchmark multi-agent tasks but does not specify exact train/validation/test dataset splits or percentages. |
| Hardware Specification | Yes | All algorithms are trained in a Linux server with a 2.30 GHz Xeon(R) CPU and two Nvidia 4090 graphics cards. |
| Software Dependencies | No | The paper mentions various algorithms and environments (e.g., MADDPG, Multi-Agent Particle Environment) but does not provide specific version numbers for software dependencies or libraries like Python, PyTorch, etc. |
| Experiment Setup | Yes | The learning rates of the critic network and the actor network are set to 0.001. The discount factor γ is set to 0.95. Each episode lasts up to 25 timesteps. To estimate the transition marginal distribution $p(s^j_{t+1} \mid s^i_t)$, the number K of Monte Carlo samples is set to 64. |
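
The reported setup can be collected into a small configuration object, and the role of K illustrated with a toy Monte Carlo estimate. The sketch below is illustrative only: `ExperimentConfig`, `toy_conditional`, and `estimate_marginal` are hypothetical names, the Gaussian transition model and the uniform action prior are stand-ins rather than the paper's learned model, and it assumes the marginal is obtained by averaging a conditional transition density over sampled actions, which the excerpt above does not spell out.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    actor_lr: float = 0.001
    critic_lr: float = 0.001
    gamma: float = 0.95          # discount factor
    max_episode_steps: int = 25  # episode length cap
    mc_samples: int = 64         # K, Monte Carlo samples for the marginal


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    """Density of a 1-D Gaussian, used by the toy model below."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def toy_conditional(next_state_j: float, state_i: float, action_i: float) -> float:
    """Hypothetical conditional density p(s_j' | s_i, a_i); NOT the paper's model."""
    return gaussian_pdf(next_state_j, mean=state_i + action_i, std=1.0)


def estimate_marginal(next_state_j, state_i, sample_action, conditional, k):
    """Monte Carlo estimate: average the conditional over k sampled actions."""
    total = 0.0
    for _ in range(k):
        total += conditional(next_state_j, state_i, sample_action())
    return total / k


cfg = ExperimentConfig()
rng = random.Random(0)  # fixed seed so repeated runs match
marginal = estimate_marginal(
    next_state_j=0.5,
    state_i=0.0,
    sample_action=lambda: rng.uniform(-1.0, 1.0),  # assumed action prior
    conditional=toy_conditional,
    k=cfg.mc_samples,
)
print(f"K={cfg.mc_samples}, estimated marginal density: {marginal:.4f}")
```

A larger K reduces the variance of the estimate at the cost of K extra model evaluations per transition, which is the trade-off a fixed setting such as 64 has to balance.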