Learning Nearly Decomposable Value Functions Via Communication Minimization
Authors: Tonghan Wang*, Jianhao Wang*, Chongyi Zheng, Chongjie Zhang
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we demonstrate that, on the StarCraft unit micromanagement benchmark, our framework significantly outperforms baseline methods and allows us to cut off more than 80% of communication without sacrificing the performance. |
| Researcher Affiliation | Academia | 1Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China 2Turing AI Institute of Nanjing, Nanjing, China |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper only provides a link to videos of experiments (https://sites.google.com/view/ndq) and does not state that the source code for the methodology is openly available or provide a link to a code repository. |
| Open Datasets | Yes | We demonstrate the effectiveness of our learning framework on the StarCraft II unit micromanagement benchmark used in Foerster et al. (2017; 2018); Rashid et al. (2018); Samvelyan et al. (2019). |
| Dataset Splits | No | The paper mentions training and testing, but does not explicitly describe a validation dataset split (e.g., percentages or counts for a validation set). |
| Hardware Specification | Yes | We train our models on NVIDIA RTX 2080Ti GPUs using experience sampled from 16 parallel environments. |
| Software Dependencies | No | The paper mentions basing the implementation on the PyMARL framework but does not provide specific version numbers for software dependencies such as Python, PyTorch, or PyMARL itself. |
| Experiment Setup | Yes | We use the same hyper-parameter setting for NDQ on all maps: β is set to 10⁻⁵, λ is set to 0.1, and the length of message m_ij is set to 3. |
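
For reference, the sketch below collects the settings quoted in the Hardware Specification and Experiment Setup rows into a single Python configuration object. This is a minimal illustration for anyone attempting a reproduction; the key names (`beta`, `lambda_`, `message_length`, `n_parallel_envs`, `gpu`) are hypothetical placeholders and do not correspond to PyMARL's actual configuration schema.

```python
# Illustrative sketch only: gathers the hyperparameters reported in the paper into one place.
# Key names are hypothetical and are not PyMARL's real config keys.

ndq_experiment_config = {
    "beta": 1e-5,                 # weight of the communication-minimization term (β)
    "lambda_": 0.1,               # weight of the auxiliary loss term (λ)
    "message_length": 3,          # length of each inter-agent message m_ij
    "n_parallel_envs": 16,        # experience sampled from 16 parallel environments
    "gpu": "NVIDIA RTX 2080 Ti",  # hardware reported in the paper
}

if __name__ == "__main__":
    # Print the settings in a stable order for quick inspection.
    for name in sorted(ndq_experiment_config):
        print(f"{name} = {ndq_experiment_config[name]}")
```

Because the paper reports a single shared hyper-parameter setting across all maps, a reproduction can keep one such configuration rather than tuning per map.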