Learning Nearly Decomposable Value Functions Via Communication Minimization

Authors: Tonghan Wang*, Jianhao Wang*, Chongyi Zheng, Chongjie Zhang

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we demonstrate that, on the StarCraft unit micromanagement benchmark, our framework significantly outperforms baseline methods and allows us to cut off more than 80% of communication without sacrificing the performance.
Researcher Affiliation | Academia | Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China; Turing AI Institute of Nanjing, Nanjing, China
Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper only provides a link to videos of experiments (https://sites.google.com/view/ndq) and does not state that the source code for the methodology is openly available or provide a link to a code repository.
Open Datasets | Yes | We demonstrate the effectiveness of our learning framework on the StarCraft II unit micromanagement benchmark used in Foerster et al. (2017; 2018); Rashid et al. (2018); Samvelyan et al. (2019).
Dataset Splits | No | The paper mentions training and testing, but does not explicitly describe a validation dataset split (e.g., percentages or counts for a validation set).
Hardware Specification | Yes | We train our models on NVIDIA RTX 2080Ti GPUs using experience sampled from 16 parallel environments.
Software Dependencies | No | The paper mentions basing the implementation on the PyMARL framework but does not provide specific version numbers for software dependencies such as Python, PyTorch, or PyMARL itself.
Experiment Setup | Yes | We use the same hyper-parameter setting for NDQ on all maps: β is set to 10^-5, λ is set to 0.1, and the length of message m_ij is set to 3.
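For concreteness, the reported setting can be captured in a small configuration sketch. This is a hypothetical illustration only: the authors did not release code, so key names such as `beta`, `lambda_`, `msg_dim`, and `n_envs` are assumptions rather than identifiers from the paper; only the numeric values are taken from the quoted text.

```python
# Hypothetical hyperparameter configuration mirroring the values quoted above.
# The authors' code is not public, so these key names are assumptions made
# purely for illustration; only the numeric values come from the paper.
ndq_config = {
    "beta": 1e-5,     # regularization coefficient β reported in the paper
    "lambda_": 0.1,   # regularization coefficient λ reported in the paper
    "msg_dim": 3,     # length of each message m_ij
    "n_envs": 16,     # parallel environments used for experience collection
}

if __name__ == "__main__":
    # Print the setting so a reproduction attempt can confirm what it is using.
    for name, value in ndq_config.items():
        print(f"{name} = {value}")
```

A reproduction built on PyMARL would typically pass such values through its YAML config files rather than a Python dict; the dict form is used here only to keep the sketch self-contained.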