Learning Agent Communication under Limited Bandwidth by Message Pruning
Authors: Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong, Yan Ni (pp. 5142-5149)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the gating mechanism on several tasks. Experiments demonstrate that it can prune a large fraction of messages with little impact on performance. In fact, performance may be greatly improved by pruning redundant messages. Moreover, the proposed gating mechanism is applicable to several previous methods, equipping them with the ability to address bandwidth-restricted settings. |
| Researcher Affiliation | Collaboration | Hangyu Mao,1 Zhengchao Zhang,1 Zhen Xiao,1 Zhibo Gong,2 Yan Ni1 1Peking University, 2Huawei Technologies Co., Ltd. |
| Pseudocode | No | The paper describes its methods and training procedures using textual descriptions and mathematical equations, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing their code, nor does it provide a direct link to a source-code repository for the described methodology. |
| Open Datasets | No | The paper describes custom simulation environments (Traffic Control, Packet Routing, Wifi Access Point Configuration) for its experiments, rather than using or providing access to pre-existing public datasets with specific citations or links. |
| Dataset Splits | No | The paper describes training procedures but does not provide specific details regarding training, validation, or test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) needed for data partitioning. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory amounts used for running its experiments. It does not mention cloud or cluster resources with specifications. |
| Software Dependencies | No | The paper mentions the use of Deep Reinforcement Learning (DRL) and Deep Neural Networks (DNN) frameworks but does not provide specific ancillary software details, such as library names with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | For a fixed T, we first sort the ΔQ(oi) of the latest K observations oi encountered during training, resulting in a sorted list of ΔQ(oi). [...] For the threshold T, we propose two methods to set a fixed T and a dynamic T, respectively. [...] T_t = (1 − β)T_{t−1} + β( Q_t(o_i, a^C_i, o_{−i}, a^C_{−i}) − Q_t(o_i, a^I_i, o_{−i}, a^C_{−i}) ) (Eq. 10), where β is a coefficient for discounting the older T, and the subscript t denotes the training timestep. We tested several values of β in [0.6, 0.9], and they all work well. |
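The two threshold schemes quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the `prune_fraction` parameter, and the default `beta=0.7` (chosen from the paper's tested range [0.6, 0.9]) are assumptions; only the update rule in Eq. 10 and the sort-the-latest-K-ΔQ procedure come from the paper.

```python
def fixed_threshold(delta_qs, prune_fraction=0.5):
    """Fixed-T variant (hypothetical helper): sort the latest K
    values of ΔQ(o_i) seen during training and take the value at the
    desired pruning percentile as the threshold T."""
    sorted_dq = sorted(delta_qs)
    idx = int(prune_fraction * (len(sorted_dq) - 1))
    return sorted_dq[idx]


def dynamic_threshold(t_prev, q_comm, q_indep, beta=0.7):
    """Dynamic-T variant, Eq. 10:
    T_t = (1 - beta) * T_{t-1} + beta * (Q_comm - Q_indep),
    where q_comm is Q_t with the communicative action a^C_i and
    q_indep is Q_t with the independent action a^I_i, other agents
    acting communicatively in both terms."""
    return (1.0 - beta) * t_prev + beta * (q_comm - q_indep)
```

A message from agent i would then be pruned whenever its ΔQ falls below the current threshold, so `prune_fraction` directly controls how much bandwidth the fixed-T variant saves.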