An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Authors: Tianpei Yang, Weixun Wang, Hongyao Tang, Jianye Hao, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yingfeng Chen, Yujing Hu, Changjie Fan, Chengwei Zhang

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our simulations show it significantly boosts the performance of existing approaches both in discrete and continuous state spaces." |
| Researcher Affiliation | Collaboration | (1) College of Intelligence and Computing, Tianjin University ({tpyang,wxwang,bluecontra,jianye.hao,mengzp}@tju.edu.cn); (2) Noah's Ark Lab, Huawei ({maohangyu1,lidong106,liuwulong}@huawei.com); (3) Dalian Maritime University (chenvy@dlmu.edu.cn); (4) NetEase Fuxi AI Lab ({huyujing,chenyingfeng1,fanchangjie}@corp.netease.com) |
| Pseudocode | Yes | Algorithm 1 (MAPTF-PPO); Algorithm 2 (SRO Learning). |
| Open Source Code | Yes | "Source code is provided on https://github.com/tianpeiyang/MAPTF_code." |
| Open Datasets | Yes | MAPTF is evaluated in combination with the popular single-agent RL algorithm PPO [29] and the MARL algorithms MADDPG [25] and QMIX [28] on two representative multiagent games: Pac-Man [31] and the multiagent particle environment (MPE) [25]. |
| Dataset Splits | No | The paper describes the environments used (Pac-Man, MPE) and states that results are "averaged over 10 seeds," but it specifies no explicit training/validation/test splits (e.g., percentages or sample counts). This is typical of RL papers, where data is generated through environment interaction rather than drawn from pre-defined static sets. |
| Hardware Specification | No | The paper gives no details of the hardware used for the experiments, such as GPU/CPU models, memory, or specific cloud instances. |
| Software Dependencies | No | The paper mentions combining MAPTF with PPO, MADDPG, and QMIX and links to the source code at https://github.com/tianpeiyang/MAPTF_code, but it lists no version numbers for software dependencies (e.g., Python, PyTorch, TensorFlow, or specific libraries). |
| Experiment Setup | No | The paper notes that "More experimental details and parameters settings are detailed in the appendix," but these details are not provided in the main body; the main text gives no hyperparameter values or system-level training settings. |
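The "averaged over 10 seeds" evaluation protocol noted in the Dataset Splits row can be sketched as follows. This is a minimal illustration, not the paper's actual evaluation code: the per-seed return values below are made-up placeholders, and only the averaging step reflects the reported protocol.

```python
import numpy as np

# Hypothetical per-seed episodic returns at three training checkpoints
# (placeholder numbers; the paper averages learning curves over 10 seeds).
returns_per_seed = np.array([
    [10.0, 12.0, 15.0],  # seed 0
    [9.0, 13.0, 14.0],   # seed 1
    [11.0, 11.0, 16.0],  # seed 2
])

# Aggregate across the seed axis to get the reported curve and its spread.
mean_curve = returns_per_seed.mean(axis=0)
std_curve = returns_per_seed.std(axis=0)

print("mean:", mean_curve)  # one value per checkpoint
print("std: ", std_curve)
```

Because RL data comes from environment interaction, this seed-level aggregation plays the role that train/validation/test splits play in supervised-learning papers.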