MADiff: Offline Multi-agent Learning with Diffusion Models

Authors: Zhengbang Zhu, Minghuan Liu, Liyuan Mao, Bingyi Kang, Minkai Xu, Yong Yu, Stefano Ermon, Weinan Zhang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate that MADIFF outperforms baseline algorithms across various multi-agent learning tasks, highlighting its effectiveness in modeling complex multi-agent interactions.
Researcher Affiliation | Collaboration | 1 Shanghai Jiao Tong University, 2 ByteDance, 3 Stanford University; {zhengbangzhu, minghuanliu, maoliyuan, yyu, wnzhang}@sjtu.edu.cn, bingykang@gmail.com, {minkai, ermon}@cs.stanford.edu
Pseudocode | Yes | Algorithm 1 Multi-Agent Planning with MADIFF; Algorithm 2 Multi-Agent Trajectory Prediction with MADIFF
Open Source Code | Yes | We provide code, anonymous data download link, and necessary instructions in supplementary materials.
Open Datasets | Yes | NBA dataset: the dataset consists of various basketball players' recorded trajectories from 631 games in the 2015-16 season. Following Alcorn and Nguyen [2021], we split 569/30/32 training/validation/test games, with downsampling from 25 Hz to 5 Hz. We use the offline datasets constructed by Pan et al. [2022], including four datasets... we use the off-the-grid offline dataset [Formanek et al., 2023]...
Dataset Splits | Yes | NBA dataset: the dataset consists of various basketball players' recorded trajectories from 631 games in the 2015-16 season. Following Alcorn and Nguyen [2021], we split 569/30/32 training/validation/test games, with downsampling from 25 Hz to 5 Hz.
Hardware Specification | Yes | On a server equipped with an AMD Ryzen 9 5900X (12 cores) CPU and an RTX 3090 GPU, we trained the MADIFF-C model on the Expert dataset from the MPE Spread task, achieving convergence in approximately one hour. The results are obtained on a server with an AMD Ryzen 9 5900X (12 cores) CPU and an RTX 3090 GPU, and are averaged over 1000 trials.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as programming languages or libraries.
Experiment Setup | Yes | We list the key hyperparameters of MADIFF we used in Table 4, Table 5, and Table 6. [These tables include: learning rate, guidance scale ω, planning horizon H, history horizon, batch size, diffusion steps K, reward discount γ, optimizer]
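
The NBA preprocessing quoted under Dataset Splits (a 569/30/32 game-level split and downsampling from 25 Hz to 5 Hz) can be illustrated with a minimal Python sketch. The function names, the seeded random shuffle, and the stride-based decimation are assumptions made for illustration only; the paper follows the split of Alcorn and Nguyen [2021], which may assign games differently.

```python
import random


def split_games(game_ids, n_train=569, n_val=30, n_test=32, seed=0):
    """Split game IDs into train/validation/test sets at the game level.

    Assumption: games are assigned by a seeded random shuffle. The split
    used in the paper follows Alcorn and Nguyen [2021] and may differ.
    """
    ids = list(game_ids)
    if len(ids) != n_train + n_val + n_test:
        raise ValueError("split sizes must sum to the number of games")
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


def downsample(frames, src_hz=25, dst_hz=5):
    """Keep every (src_hz // dst_hz)-th frame.

    Assumption: plain decimation without filtering or interpolation.
    """
    if src_hz % dst_hz != 0:
        raise ValueError("source rate must be a multiple of target rate")
    stride = src_hz // dst_hz
    return frames[::stride]


# Hypothetical usage: 631 games identified by integer IDs.
train, val, test = split_games(range(631))

# Two seconds of 25 Hz frames (indices 0..49) reduced to 5 Hz.
frames_5hz = downsample(list(range(50)))
```

Splitting by game rather than by frame keeps every trajectory from a given game in a single partition, so no frames from a test game appear in training.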