reproducibilityindex.ai

Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning

Authors: Jianzhun Shao, Yun Qu, Chen Chen, Hongchang Zhang, Xiangyang Ji

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We further conduct experiments on four environments including both discrete and continuous action settings on both existing and our man-made datasets, demonstrating that CFCQL outperforms existing methods on most datasets and even with a remarkable margin on some of them.
Researcher Affiliation	Academia	Jianzhun Shao, Yun Qu, Chen Chen, Hongchang Zhang, Xiangyang Ji; Department of Automation, Tsinghua University, Beijing, China; {sjz18, qy22, hc-zhang19}@mails.tsinghua.edu.cn, cclvr@163.com, xyji@tsinghua.edu.cn
Pseudocode	Yes	Algorithm 1 CFCQL-D and CFCQL-C
Open Source Code	Yes	Our code and datasets are available at: https://github.com/thu-rllab/CFCQL
Open Datasets	Yes	With datasets collected by Pan et al. [34] and ourselves, our method outperforms existing methods in most settings and even with a large margin on some of them. Our code and datasets are available at: https://github.com/thu-rllab/CFCQL
Dataset Splits	No	The paper describes how datasets were collected (e.g., 'The datasets are made based on the training process or trained model of QMIX[37]') but does not explicitly state train/validation/test splits by percentages, absolute counts, or by referencing predefined standard splits for their experiments.
Hardware Specification	Yes	We use 2 servers to run all the experiments. Each one has 8 NVIDIA RTX 3090 GPUs and 2 AMD 7H12 CPUs. Each setting is repeated for 5 seeds.
Software Dependencies	No	The paper refers to using various open-source implementations (e.g., 'from Lowe et al. [27]', 'from Samvelyan et al. [40]') and general tools like 'Q-learning' or 'TD3', but does not provide specific version numbers for software dependencies or libraries (e.g., 'Python 3.8', 'PyTorch 1.9').
Experiment Setup	Yes	Please refer to this repository for the code, datasets and the hyper-parameters of our method.