Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective
Authors: Yang Zhang, Xinran Li, Jianing Ye, Shuang Qiu, Delin Qu, Xiu Li, Chongjie Zhang, Chenjia Bai
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate DIMA on challenging continuous MARL benchmarks, including MAMuJoCo [22] and Bi-DexHands [23], in low-data regimes. Experimental results show that DIMA consistently improves the prediction accuracy of environment dynamics and outperforms both model-free and strong model-based MARL baselines in terms of final return and sample efficiency. |
| Researcher Affiliation | Collaboration | 1Tsinghua University, 2The Hong Kong University of Science and Technology, 3Washington University in St. Louis, 4City University of Hong Kong, 5Fudan University, 6Institute of Artificial Intelligence (Tele AI), China Telecom, 7Shenzhen Research Institute of Northwestern Polytechnical University |
| Pseudocode | Yes | We summarize the overall training procedure of DIMA paired with learning in imaginations in Algorithm 1 below. We denote as D the replay databuffer which stores data collected from the real environment. |
| Open Source Code | Yes | Codes are open-sourced at https://github.com/breez3young/DIMA. |
| Open Datasets | Yes | We evaluate our method on two widely-used multi-agent continuous control benchmarks requiring heterogeneous-agent cooperation: Multi-Agent MuJoCo (MAMuJoCo) [22] and Bimanual Dexterous Hands (Bi-DexHands) [23]. |
| Dataset Splits | No | To highlight the sample efficiency of learning in imaginations, we adopt a low-data regime [66], limiting real-environment samples to 1M for MAMuJoCo and 300k for Bi-DexHands, adjusted for their different episode lengths. |
| Hardware Specification | Yes | All our experiments including the evaluation of chosen baselines are run on a machine with a single NVIDIA RTX 4090 GPU, a 24-core CPU, and 256GB RAM. |
| Software Dependencies | No | Our implementation is based on the open-source repository: https://github.com/lucidrains/vector-quantize-pytorch. |
| Experiment Setup | Yes | Table 4: Behaviour learning hyperparameters. Table 9: Architecture details. Table 10: Hyperparameters for DIMA. |