Learning Distinguishable Trajectory Representation with Contrastive Loss
Authors: Tianxu Li, Kun Zhu, Juan Li, Yang Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement CTR on top of QMIX and evaluate its performance in various cooperative multi-agent tasks. The empirical results demonstrate that our proposed CTR yields significant performance improvement over the state-of-the-art methods. |
| Researcher Affiliation | Academia | Tianxu Li (1,2), Kun Zhu (1,2), Juan Li (1), Yang Zhang (1); (1) College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China; (2) Collaborative Innovation Center of Novel Software Technology and Industrialization. {tianxuli, zhukun, yangzhang, juanli}@nuaa.edu.cn |
| Pseudocode | Yes | We refer the reader to Appendix C for the PyTorch-style pseudocode of our proposed CTR. Algorithm 1: PyTorch-style pseudocode for CTR (a hedged contrastive-loss sketch appears after this table). |
| Open Source Code | Yes | Our code can be found in the uploaded supplemental material. |
| Open Datasets | Yes | We evaluate CTR in the Pac-Men, SMAC, and SMACv2 benchmarks. The StarCraft Multi-Agent Challenge (SMAC) [Samvelyan et al., 2019] is a commonly used benchmark for evaluating cooperative MARL algorithms. SMACv2 [Ellis et al., 2022] enables stochasticity in SMAC scenarios by introducing random team compositions and random start positions. |
| Dataset Splits | No | The paper mentions 'test win rates' and '32 test episodes' but does not explicitly state any train/validation/test dataset splits by percentage or sample count, nor does it explicitly mention a 'validation' set or phase. |
| Hardware Specification | Yes | All experiments are performed using NVIDIA GeForce RTX 4090 GPUs. |
| Software Dependencies | No | The paper states 'We implement our method with NumPy and PyTorch.' but does not provide specific version numbers for these software libraries. |
| Experiment Setup | Yes | The hyperparameters of CTR and baseline algorithms in Pac-Men, SMAC, and SMACv2 are listed in Table 4. We set the evaluation interval to 10K steps followed by 32 test episodes. We run all methods for 5 million steps. In both SMAC and SMACv2, the target networks are updated via hard updates every 200 episodes. In Pac-Men, the target networks use soft updates at a momentum rate of 0.01 (see the target-network update sketch after the table). |
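
The paper's CTR pseudocode lives in its Appendix C and is not reproduced in this report. As a point of reference for the "contrastive loss over trajectory representations" idea named in the title, the following is a minimal, hypothetical sketch of an InfoNCE-style contrastive loss over per-agent trajectory embeddings; the function name, arguments, and temperature value are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def trajectory_contrastive_loss(anchor_emb, positive_emb, temperature=0.1):
    # anchor_emb, positive_emb: (batch, dim) trajectory embeddings.
    # Row i of each tensor is assumed to encode the same agent/behaviour, so the
    # diagonal of the similarity matrix holds the positive pairs and every
    # off-diagonal entry acts as a negative.
    anchor = F.normalize(anchor_emb, dim=-1)
    positive = F.normalize(positive_emb, dim=-1)
    logits = anchor @ positive.t() / temperature   # (batch, batch) cosine similarities
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)

# Usage with random embeddings:
# loss = trajectory_contrastive_loss(torch.randn(32, 64), torch.randn(32, 64))
```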
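
The quoted experiment setup distinguishes hard target-network updates (every 200 episodes in SMAC and SMACv2) from soft updates with a momentum rate of 0.01 (Pac-Men). The sketch below illustrates both update rules under those assumptions; the helper names and module handles are hypothetical and not taken from the paper's code.

```python
import torch

@torch.no_grad()
def hard_update(target_net, online_net):
    # Copy the online parameters into the target network
    # (applied every 200 episodes in SMAC/SMACv2 per the quoted setup).
    target_net.load_state_dict(online_net.state_dict())

@torch.no_grad()
def soft_update(target_net, online_net, tau=0.01):
    # Exponential moving average toward the online parameters
    # (momentum rate tau = 0.01 in Pac-Men per the quoted setup).
    for t_param, o_param in zip(target_net.parameters(), online_net.parameters()):
        t_param.mul_(1.0 - tau).add_(o_param, alpha=tau)
```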