reproducibilityindex.ai

Multi-Agent Domain Calibration with a Handful of Offline Data

Authors: Tao Jiang, Lei Yuan, Lihe Li, Cong Guan, Zongzhang Zhang, Yang Yu

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our empirical evaluation on 21 offline locomotion tasks in D4RL and NeoRL benchmarks showcases the superior performance of our method compared to strong existing offline model-based RL, offline domain calibration, and hybrid offline-and-online RL baselines.
Researcher Affiliation	Collaboration	(1) National Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, China; (2) School of Artificial Intelligence, Nanjing University, Nanjing, China; (3) Polixir Technologies, Nanjing, China
Pseudocode	Yes	The pseudo-code of Madoc is presented in Alg. 1; we utilize SAC [59] and DOP [49] as our backbone algorithms for domain calibration.
Open Source Code	Yes	The source code is available at https://github.com/LAMDA-RL/Madoc.
Open Datasets	Yes	On the popular D4RL benchmark [60], we choose four locomotion tasks (HalfCheetah, Hopper, Walker2d, Ant), each with three types of datasets (medium, medium-replay, medium-expert), to evaluate different algorithms' performance when faced with datasets of varying quality. Considering more challenging scenarios, three environments (HalfCheetah, Hopper, Walker2d) along with three levels of datasets (low, medium, high) from the NeoRL benchmark [61] are also selected.
Dataset Splits	Yes	As illustrated in Fig. 4(a), the algorithms access datasets of different magnitudes, 5 × 10^4 (small), 2 × 10^5 (medium), and 1 × 10^6 (large), to reflect a spectrum of data availability.
Hardware Specification	Yes	Most experiments were conducted on a server outfitted with a 13th Gen Intel(R) Core(TM) i9-13900K CPU, 2 NVIDIA RTX A5000 GPUs, and 125GB of RAM, running Ubuntu 22.04.
Software Dependencies	No	The paper mentions “Ubuntu 22.04” as the operating system and refers to various algorithms and frameworks like SAC and DOP, but it does not specify version numbers for other key software libraries or dependencies (e.g., PyTorch, TensorFlow, NumPy).
Experiment Setup	Yes	We list the default hyper-parameter settings for Madoc in Tab. 5.
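
The "21 offline locomotion tasks" figure in the Research Type row can be checked against the environments and dataset levels listed in the Open Datasets row: 4 × 3 D4RL combinations plus 3 × 3 NeoRL combinations. The sketch below only enumerates those combinations; the variable and function names are illustrative and do not come from the paper or the Madoc repository, and no benchmark-specific dataset identifiers or versions are assumed.

```python
from itertools import product

# Environment and dataset-level names as listed in the Open Datasets row.
D4RL_ENVS = ["HalfCheetah", "Hopper", "Walker2d", "Ant"]
D4RL_LEVELS = ["medium", "medium-replay", "medium-expert"]
NEORL_ENVS = ["HalfCheetah", "Hopper", "Walker2d"]
NEORL_LEVELS = ["low", "medium", "high"]

# Dataset magnitudes quoted in the Dataset Splits row (hypothetical dict name).
DATASET_SIZES = {"small": 5 * 10**4, "medium": 2 * 10**5, "large": 1 * 10**6}


def enumerate_tasks():
    """Return (benchmark, environment, dataset level) triples."""
    d4rl = [("D4RL", env, lvl) for env, lvl in product(D4RL_ENVS, D4RL_LEVELS)]
    neorl = [("NeoRL", env, lvl) for env, lvl in product(NEORL_ENVS, NEORL_LEVELS)]
    return d4rl + neorl


tasks = enumerate_tasks()
# 12 D4RL combinations plus 9 NeoRL combinations.
print(len(tasks))
```

The count agrees with the number of tasks quoted from the paper, which suggests the Research Type and Open Datasets rows describe the same evaluation suite.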