PTDE: Personalized Training with Distilled Execution for Multi-Agent Reinforcement Learning

Authors: Yiqun Chen, Hangyu Mao, Jiaxin Mao, Shiguang Wu, Tianle Zhang, Bin Zhang, Wei Yang, Hongxing Chang

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | PTDE can be seamlessly integrated with state-of-the-art algorithms, leading to notable performance enhancements across diverse benchmarks, including the SMAC benchmark, Google Research Football (GRF) benchmark, and Learning to Rank (LTR) task. |
| Researcher Affiliation | Collaboration | 1. Renmin University of China 2. SenseTime 3. Noah's Ark Lab, Huawei 4. JD Explore Academy 5. Institute of Automation, Chinese Academy of Sciences |
| Pseudocode | Yes | Algorithm 1: The first training stage of PTDE |
| Open Source Code | No | The paper does not provide an explicit statement or a link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We conducted training and testing on 10,000 queries (7:3 partition) from the MSLR-WEB30K [Qin and Liu, 2013] dataset |
| Dataset Splits | Yes | We conducted training and testing on 10,000 queries (7:3 partition) from the MSLR-WEB30K [Qin and Liu, 2013] dataset |
| Hardware Specification | No | The paper mentions using "8 parallel runners" but does not provide specific details about the GPU/CPU models, memory, or other hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using the PyMARL2 framework [Hu et al., 2021] but does not specify its version or the versions of other key software components such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Details regarding hyperparameters are available in Table 7 in the Appendix. |