Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning
Authors: Jianzhun Shao, Yun Qu, Chen Chen, Hongchang Zhang, Xiangyang Ji
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further conduct experiments on four environments including both discrete and continuous action settings on both existing and our man-made datasets, demonstrating that CFCQL outperforms existing methods on most datasets and even with a remarkable margin on some of them. |
| Researcher Affiliation | Academia | Jianzhun Shao, Yun Qu, Chen Chen, Hongchang Zhang, Xiangyang Ji; Department of Automation, Tsinghua University, Beijing, China; {sjz18, qy22, hc-zhang19}@mails.tsinghua.edu.cn, cclvr@163.com, xyji@tsinghua.edu.cn |
| Pseudocode | Yes | Algorithm 1 CFCQL-D and CFCQL-C |
| Open Source Code | Yes | Our code and datasets are available at: https://github.com/thu-rllab/CFCQL |
| Open Datasets | Yes | "With datasets collected by Pan et al. [34] and ourselves, our method outperforms existing methods in most settings and even with a large margin on some of them." and "Our code and datasets are available at: https://github.com/thu-rllab/CFCQL" |
| Dataset Splits | No | The paper describes how datasets were collected (e.g., 'The datasets are made based on the training process or trained model of QMIX[37]') but does not explicitly state train/validation/test splits by percentages, absolute counts, or by referencing predefined standard splits for their experiments. |
| Hardware Specification | Yes | We use 2 servers to run all the experiments. Each one has 8*NVIDIA RTX 3090 GPUs, and 2*AMD 7H12 CPUs. Each setting is repeated for 5 seeds. |
| Software Dependencies | No | The paper refers to using various open-source implementations (e.g., 'from Lowe et al. [27]', 'from Samvelyan et al. [40]') and general tools like 'Q-learning' or 'TD3', but does not provide specific version numbers for software dependencies or libraries (e.g., 'Python 3.8', 'PyTorch 1.9'). |
| Experiment Setup | Yes | Please refer to this repository for the code, datasets and the hyper-parameters of our method. |