Grounded Answers for Multi-agent Decision-making Problem through Generative World Model
Authors: Zeyang Liu, Xinrui Yang, Shiguang Sun, Long Qian, Lipeng Wan, Xingyu Chen, Xuguang Lan
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical results demonstrate that this framework can improve the answers for multi-agent decision-making problems by showing superior performance on the training and unseen tasks of the StarCraft Multi-Agent Challenge benchmark. In particular, it can generate consistent interaction sequences and explainable reward functions at interaction states, opening the path for training generative models of the future. |
| Researcher Affiliation | Academia | Zeyang Liu (zeyang.liu@stu.xjtu.edu.cn), Xinrui Yang (xinrui.yang@stu.xjtu.edu.cn), Shiguang Sun (ssg2019@stu.xjtu.edu.cn), Long Qian (qianlongym@stu.xjtu.edu.cn), Lipeng Wan (wanlipeng77@xjtu.edu.cn), Xingyu Chen (chenxingyu_1990@xjtu.edu.cn), Xuguang Lan (xglan@mail.xjtu.edu.cn); National Key Laboratory of Human-Machine Hybrid Augmented Intelligence; National Engineering Research Center for Visual Information and Application; Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | We choose not to release the data and code at present. We would like to have the opportunity to further engage with the research community and to ensure that any future such releases are respectful, safe, and responsible. |
| Open Datasets | Yes | The training maps include 3s5z, 1c3s5z, 10m_vs_11m, 2c_vs_64zg, 3s_vs_5z, 5m_vs_6m, 6h_vs_8z, 3s5z_vs_3s6z, corridor, MMM2 in the StarCraft Multi-Agent Challenge (SMAC) (Samvelyan et al., 2019). We use EMC (Zheng et al., 2021) and IIE (Liu et al., 2024) to collect 50,000 trajectories for each map and save these data as NPY files. (A storage sketch for this step appears after the table.) |
| Dataset Splits | No | The paper describes using 'training maps' and 'unseen testing maps' but does not explicitly mention a separate 'validation split' or 'validation set' with specific proportions or counts for hyperparameter tuning. |
| Hardware Specification | Yes | In this paper, all experiments are implemented with PyTorch and executed on eight NVIDIA A800 GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not provide a specific version number for it or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | We train our image tokenizer for 100k steps using the AdamW optimizer, with cosine decay, using the hyperparameters in Table 8. The batch size is 32, and the learning rate is 1e-4. ... We build our dynamics model implementation based on Decision Transformer (Chen et al., 2021). The complete list of hyperparameters can be found in Table 9. (An optimizer-configuration sketch appears after the table.) |
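
The Open Datasets row states that 50,000 trajectories per SMAC map are collected (via EMC and IIE) and saved as NPY files, but no schema is given. Below is a minimal, hypothetical sketch of that storage step; the dict fields, shapes, and file name are assumptions for illustration, not the paper's actual format.

```python
import numpy as np

def save_trajectories(trajectories, path="smac_3s5z_trajectories.npy"):
    """Persist a list of per-episode dicts as a single NumPy object array."""
    np.save(path, np.array(trajectories, dtype=object), allow_pickle=True)

def load_trajectories(path="smac_3s5z_trajectories.npy"):
    """Load trajectories back; allow_pickle is required for object arrays."""
    return np.load(path, allow_pickle=True)

# Example: two toy episodes with observations, actions, and rewards.
episodes = [
    {"obs": np.zeros((10, 8)), "actions": np.zeros(10, dtype=np.int64), "rewards": np.zeros(10)},
    {"obs": np.ones((7, 8)), "actions": np.ones(7, dtype=np.int64), "rewards": np.ones(7)},
]
save_trajectories(episodes)
assert len(load_trajectories()) == 2
```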
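The Experiment Setup row reports the tokenizer's optimization settings but not the surrounding code. The sketch below shows one plausible PyTorch realization of those settings (AdamW, cosine learning-rate decay over 100k steps, batch size 32, peak learning rate 1e-4); the model and loss are placeholders, since the actual architecture hyperparameters live in the paper's Tables 8 and 9 and are not reproduced here.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(256, 256)                # stand-in for the image tokenizer
optimizer = AdamW(model.parameters(), lr=1e-4)   # reported: AdamW, lr 1e-4
scheduler = CosineAnnealingLR(optimizer, T_max=100_000)  # reported: cosine decay

for step in range(100_000):                      # reported: 100k training steps
    batch = torch.randn(32, 256)                 # stand-in batch of size 32
    loss = (model(batch) - batch).pow(2).mean()  # placeholder reconstruction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                             # decay the learning rate per step
```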