Learning Agent Communication under Limited Bandwidth by Message Pruning

Authors: Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong, Yan Ni

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the gating mechanism on several tasks. Experiments demonstrate that it can prune a large share of messages with little impact on performance; in fact, performance may even be improved by pruning redundant messages. Moreover, the proposed gating mechanism is applicable to several previous methods, equipping them with the ability to address bandwidth-restricted settings.
Researcher Affiliation | Collaboration | Hangyu Mao¹, Zhengchao Zhang¹, Zhen Xiao¹, Zhibo Gong², Yan Ni¹ (¹Peking University, ²Huawei Technologies Co., Ltd.)
Pseudocode | No | The paper describes its methods and training procedures using textual descriptions and mathematical equations, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing their code, nor does it provide a direct link to a source-code repository for the described methodology.
Open Datasets | No | The paper describes custom simulation environments (Traffic Control, Packet Routing, Wifi Access Point Configuration) for its experiments, rather than using or providing access to pre-existing public datasets with specific citations or links.
Dataset Splits | No | The paper describes training procedures but does not provide specific details regarding training, validation, or test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) needed for data partitioning.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory amounts used for running its experiments. It does not mention cloud or cluster resources with specifications.
Software Dependencies | No | The paper mentions the use of Deep Reinforcement Learning (DRL) and Deep Neural Network (DNN) frameworks but does not provide specific ancillary software details, such as library names with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | For a fixed T, we first sort the ΔQ(o_i) of the latest K observations o_i encountered during training, resulting in a sorted list of ΔQ(o_i). [...] For the threshold T, we propose two methods to set a fixed T and a dynamic T, respectively. [...] T_t = (1 - β) T_{t-1} + β (Q_t(o_i, a_i^C, o_{-i}, a_{-i}^C) - Q_t(o_i, a_i^I, o_{-i}, a_{-i}^C)) (Eq. 10), where β is a coefficient for discounting the older T, and the subscript t represents the training timestep. We test several β in [0.6, 0.9], and they all work well.
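
The two threshold schemes in the excerpt can be made concrete with a short sketch. Since the paper releases no code, everything below is an illustrative assumption: the class name GatingThreshold, the deque-based history of the latest K values of ΔQ(o_i), the quantile rule used for the fixed T (the exact rule is elided by the "[...]" above), and the prune_ratio parameter are hypothetical; only the moving-average update for the dynamic T follows Eq. 10 as quoted.

```python
from collections import deque


class GatingThreshold:
    """Minimal sketch of the message-pruning threshold T.

    ΔQ(o_i) = Q(o_i, a_i^C, o_-i, a_-i^C) - Q(o_i, a_i^I, o_-i, a_-i^C)
    is assumed to be computed elsewhere by the critic; this class only
    maintains T and decides whether a message is worth sending.
    """

    def __init__(self, beta=0.8, history_size=1000, prune_ratio=0.5):
        # beta discounts the older T; the excerpt reports beta in [0.6, 0.9] works well.
        self.beta = beta
        # Latest K values of ΔQ(o_i) seen during training (K = history_size here).
        self.history = deque(maxlen=history_size)
        # Hypothetical knob: fraction of messages to prune under the fixed-T scheme.
        self.prune_ratio = prune_ratio
        self.t = 0.0  # current threshold T_t

    def fixed_threshold(self):
        """Fixed T: sort the stored ΔQ values and take a quantile.

        The excerpt only says the latest K values are sorted; picking the
        prune_ratio quantile is an assumed way to turn that sorted list
        into a single threshold.
        """
        if not self.history:
            return 0.0
        ranked = sorted(self.history)
        idx = min(int(self.prune_ratio * len(ranked)), len(ranked) - 1)
        return ranked[idx]

    def dynamic_update(self, delta_q):
        """Dynamic T, following Eq. 10: T_t = (1 - beta) * T_{t-1} + beta * ΔQ_t."""
        self.history.append(delta_q)
        self.t = (1.0 - self.beta) * self.t + self.beta * delta_q
        return self.t

    def should_send(self, delta_q, use_dynamic=True):
        """Send (i.e., do not prune) the message only if ΔQ exceeds the threshold."""
        threshold = self.t if use_dynamic else self.fixed_threshold()
        return delta_q > threshold
```

In use, an agent would call dynamic_update with the current ΔQ at each training timestep and gate its outgoing message with should_send; under the fixed scheme, raising prune_ratio prunes a larger share of messages in this sketch.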