Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning

Authors: Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, Zhen Xiao

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on several challenging tasks (i.e., packet routing, wifi configuration and Google football player control) justify the superior performance of our methods compared with state-of-the-art MARL approaches.
Researcher Affiliation | Collaboration | 1 Peking University; 2 Noah's Ark Lab, Huawei; 3 Tianjin University; 4 University College London
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code or a link to a code repository.
Open Datasets | Yes | For continuous actions, we adopt the packet routing tasks proposed by ATT-MADDPG (Mao et al. 2019) to ensure fair comparison. For discrete actions, we test wifi configuration and Google football environments. ... We refer the readers to (Kurach et al. 2019) for the details [of Google Football]. (A hedged environment-setup sketch follows this table.)
Dataset Splits | No | The paper describes the experimental environments and scenarios (e.g., packet routing, wifi configuration, Google Football 2-vs-2, 3-vs-2) but does not provide specific details on how the data within these environments are split into training, validation, and test sets (e.g., percentages, sample counts, or explicit standard splits).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific solver versions).
Experiment Setup | Yes | In our experiments, the weights of CD-loss are α = 0.1, α = 0.1 and α = 0.2 for packet routing, wifi configuration and Google football, respectively. In order to accelerate learning in Google football, we reduce the action set to {top-right-move, bottom-right-move, high-pass, shot} during exploration, and we give a reward 1.0, 0.7 and 0.3 to the last three actions respectively if the agents score a goal. (See the loss-weighting and reward-shaping sketch after the table.)
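
Since the paper evaluates on Google Research Football (Kurach et al. 2019) but does not name exact scenario files, here is a minimal sketch of how such an environment can be instantiated with the gfootball package. The scenario "academy_3_vs_1_with_keeper" is a stock scenario used only as a stand-in for the paper's 2-vs-2 and 3-vs-2 setups; representation and player-count arguments are illustrative choices, not the paper's configuration.

    # Minimal sketch: instantiate a Google Research Football scenario.
    # "academy_3_vs_1_with_keeper" is an assumed stand-in; the paper's
    # exact 2-vs-2 / 3-vs-2 scenarios are not specified by name.
    import gfootball.env as football_env

    env = football_env.create_environment(
        env_name="academy_3_vs_1_with_keeper",    # assumed stand-in scenario
        representation="simple115",               # compact per-player features
        number_of_left_players_agent_controls=3,  # multi-agent control
    )

    obs = env.reset()
    done = False
    while not done:
        # Random placeholder policy; the paper trains MARL agents instead.
        actions = env.action_space.sample()
        obs, reward, done, info = env.step(actions)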
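
The Experiment Setup row quotes two concrete details: the per-task CD-loss weights α and the Google-football reward shaping. The sketch below shows one way these numbers could be wired up; composing the total loss as task loss plus α times the CD-loss, and adding the shaping bonus to the environment reward, are assumptions for illustration rather than the paper's confirmed implementation.

    import torch

    # CD-loss weights as reported in the paper, keyed by task. Composing the
    # total loss as "task_loss + alpha * cd_loss" is an assumption.
    CD_LOSS_WEIGHT = {
        "packet_routing": 0.1,
        "wifi_configuration": 0.1,
        "google_football": 0.2,
    }

    def total_loss(task_loss: torch.Tensor, cd_loss: torch.Tensor,
                   task: str) -> torch.Tensor:
        """Weight the cognition-consistency (CD) loss by the per-task alpha."""
        return task_loss + CD_LOSS_WEIGHT[task] * cd_loss

    # Reduced exploration action set and shaped rewards quoted from the paper:
    # the last three actions receive 1.0, 0.7 and 0.3 when the agents score.
    EXPLORATION_ACTIONS = ["top-right-move", "bottom-right-move",
                           "high-pass", "shot"]
    GOAL_BONUS = {"bottom-right-move": 1.0, "high-pass": 0.7, "shot": 0.3}

    def shaped_reward(env_reward: float, last_action: str,
                      scored_goal: bool) -> float:
        """Add the action-conditioned bonus when a goal is scored (assumed
        to be additive; the paper only states the bonus values)."""
        bonus = GOAL_BONUS.get(last_action, 0.0) if scored_goal else 0.0
        return env_reward + bonus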