Deep Coordination Graphs
Authors: Wendelin Boehmer, Vitaly Kurin, Shimon Whiteson
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we compare the performance of DCG with various topologies (see Table 1) to the state-of-the-art algorithms QTRAN (Son et al., 2019), QMIX (Rashid et al., 2018), VDN (Sunehag et al., 2018) and IQL (Tan, 1993). We evaluate these methods in two complex grid-world tasks and challenging StarCraft II micromanagement tasks from the StarCraft Multi-Agent Challenge (SMAC, Samvelyan et al., 2019). |
| Researcher Affiliation | Academia | Wendelin Böhmer, Vitaly Kurin, Shimon Whiteson; Department of Computer Science, Oxford University, United Kingdom. |
| Pseudocode | Yes | The detailed procedures of computing the tensors (Algorithm 1), the Q-value (Algorithm 2) and greedy action selection (Algorithm 3) are given in the appendix. |
| Open Source Code | Yes | An open-source implementation of DCG and all discussed algorithms and tasks is available for full reproducibility3. 3 https://github.com/wendelinboehmer/dcg |
| Open Datasets | No | The paper describes using custom grid-world predator-prey tasks and StarCraft II micromanagement tasks (SMAC). While SMAC is a well-known benchmark, the paper does not provide explicit access information (link, DOI, citation with author/year) for these environments as 'datasets' in the sense of static data collections. It uses them as simulation environments for RL training and evaluation. |
| Dataset Splits | No | The paper evaluates performance on 'greedy test episodes' and 'win rate of test episodes' for RL tasks. However, it does not specify explicit training, validation, or test dataset splits in terms of percentages or sample counts for static datasets, as is typical for supervised learning. Data is generated dynamically through environment interaction. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments; no concrete hardware models are named. |
| Software Dependencies | No | The paper mentions that 'All algorithms are implemented in the PYMARL framework (Samvelyan et al., 2019)' but does not specify version numbers for PyMARL or any other software libraries or dependencies, such as Python, PyTorch, or CUDA versions. |
| Experiment Setup | Yes | The paper describes key aspects of the experimental setup, including the use of deep Q-learning (DQN), an experience replay buffer, target networks, and Double Q-learning. It also details the nature of the tasks: 'We evaluate these methods in two complex grid-world tasks and challenging StarCraft II micromanagement tasks'. |
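The pseudocode row above refers to computing the joint Q-value over a coordination graph. As context for readers of this report, the sketch below illustrates the general idea of a DCG-style factored Q-value: the joint value is the average of per-agent utilities plus the average of per-edge payoffs. The function and variable names (`dcg_q_value`, `utilities`, `payoffs`, `edges`) are illustrative assumptions, not the paper's actual API, and the tabular dictionaries stand in for the paper's neural networks.

```python
def dcg_q_value(utilities, payoffs, edges, actions):
    """Hedged sketch of a coordination-graph Q-value.

    utilities: per-agent dicts mapping action -> utility value
    payoffs:   dict mapping an edge (i, j) to a dict (a_i, a_j) -> payoff
    edges:     list of (i, j) agent-index pairs forming the graph
    actions:   chosen action for each agent
    """
    n_agents = len(utilities)
    # Mean of individual utilities for the selected actions.
    util = sum(utilities[i][actions[i]] for i in range(n_agents)) / n_agents
    # Mean of pairwise payoffs along the graph edges.
    pay = sum(payoffs[(i, j)][(actions[i], actions[j])]
              for (i, j) in edges) / len(edges)
    return util + pay


# Tiny two-agent example with a single edge.
utilities = [{0: 1.0, 1: 2.0}, {0: 0.5, 1: 1.5}]
payoffs = {(0, 1): {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}}
q = dcg_q_value(utilities, payoffs, edges=[(0, 1)], actions=[1, 0])
```

In the paper, greedy action selection over such a factorization is done with message passing rather than the exhaustive enumeration a dict lookup implies; this sketch only shows how a fixed joint action is scored.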
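The experiment-setup row mentions target networks and Double Q-learning. For reference, the standard Double Q-learning target selects the greedy next action with the online network but evaluates it with the target network. This is a generic textbook sketch with tabular stand-ins for the networks; the helper name `double_q_target` is an assumption, not code from the paper's repository.

```python
def double_q_target(reward, gamma, q_online_next, q_target_next, done):
    """Standard Double Q-learning bootstrap target.

    q_online_next / q_target_next: dicts mapping action -> Q-value
    for the next state, standing in for the online and target networks.
    """
    if done:
        return reward  # no bootstrap at episode termination
    # Action selection by the online network ...
    a_star = max(q_online_next, key=q_online_next.get)
    # ... but value estimation by the target network.
    return reward + gamma * q_target_next[a_star]


# Example: online net prefers action 1, target net values it at 1.0.
y = double_q_target(reward=1.0, gamma=0.9,
                    q_online_next={0: 2.0, 1: 3.0},
                    q_target_next={0: 5.0, 1: 1.0},
                    done=False)
```

Decoupling selection from evaluation in this way reduces the overestimation bias of plain max-based bootstrapping.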