Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

MADiff: Offline Multi-agent Learning with Diffusion Models

Authors: Zhengbang Zhu, Minghuan Liu, Liyuan Mao, Bingyi Kang, Minkai Xu, Yong Yu, Stefano Ermon, Weinan Zhang

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments demonstrate that MADIFF outperforms baseline algorithms across various multi-agent learning tasks, highlighting its effectiveness in modeling complex multi-agent interactions.
Researcher Affiliation	Collaboration	1 Shanghai Jiao Tong University, 2 Byte Dance, 3 Stanford University EMAIL, EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1 Multi-Agent Planning with MADIFF; Algorithm 2 Multi-Agent Trajectory Prediction with MADIFF
Open Source Code	Yes	We provide code, anonymous data download link, and necessary instructions in supplementary materials.
Open Datasets	Yes	NBA dataset: the dataset consists of various basketball players recorded trajectories from 631 games in the 2015-16 season. Following Alcorn and Nguyen [2021], we split 569/30/32 training/validation/test games, with downsampling from 25 Hz to 5Hz. We use the offline datasets constructed by Pan et al. [2022], including four datasets... we use the off-the-grid offline dataset [Formanek et al., 2023]...
Dataset Splits	Yes	NBA dataset: the dataset consists of various basketball players recorded trajectories from 631 games in the 2015-16 season. Following Alcorn and Nguyen [2021], we split 569/30/32 training/validation/test games, with downsampling from 25 Hz to 5Hz.
Hardware Specification	Yes	On a server equipped with an AMD Ryzen 9 5900X (12 cores) CPU and an RTX 3090 GPU, we trained the MADIFF-C model on the Expert dataset from the MPE Spread task, achieving convergence in approximately one hour. The results are obtained on a server with an AMD Ryzen 9 5900X (12 cores) CPU and an RTX 3090 GPU, and are averaged over 1000 trials.
Software Dependencies	No	The paper does not provide specific version numbers for software dependencies such as programming languages or libraries.
Experiment Setup	Yes	We list the key hyperparameters of MADIFF we used in Table 4, Table 5, and Table 6. [These tables include: Learning rate, Guidance scale ω, Planning horizon H, History horizon, Batch size, Diffusion steps K, Reward discount γ, Optimizer]
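
The NBA preprocessing quoted under Open Datasets and Dataset Splits (631 games split 569/30/32, trajectories downsampled from 25 Hz to 5 Hz) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the seeded shuffle, and the in-memory frame list are assumptions made for illustration. The paper follows the split of Alcorn and Nguyen [2021], whose exact game assignment is not reproduced here.

```python
import random


def split_games(game_ids, n_train=569, n_val=30, n_test=32, seed=0):
    """Split game IDs into train/validation/test sets at the game level.

    Hypothetical helper: the seeded shuffle is an assumption, not the
    assignment used by Alcorn and Nguyen [2021].
    """
    ids = list(game_ids)
    if len(ids) != n_train + n_val + n_test:
        raise ValueError("split sizes must sum to the number of games")
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


def downsample(frames, src_hz=25, dst_hz=5):
    """Keep every (src_hz // dst_hz)-th frame, e.g. every 5th for 25 Hz to 5 Hz."""
    if src_hz % dst_hz != 0:
        raise ValueError("source rate must be an integer multiple of target rate")
    step = src_hz // dst_hz
    return frames[::step]


# 569 + 30 + 32 = 631 games, matching the count quoted above.
train, val, test = split_games(range(631))
one_second = downsample(list(range(25)))  # 25 frames at 25 Hz become 5 frames
```

Splitting by game rather than by individual trajectory keeps frames from the same game out of both the training and test sets, which is the usual reason for a game-level split.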