Efficient Multi-task Reinforcement Learning with Cross-Task Policy Guidance

Authors: Jinmin He, Kai Li, Yifan Zang, Haobo Fu, Qiang Fu, Junliang Xing, Jian Cheng

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluations demonstrate that incorporating CTPG with these approaches significantly enhances performance in manipulation and locomotion benchmarks.
Researcher Affiliation | Collaboration | 1 Institute of Automation, Chinese Academy of Sciences; 2 School of Artificial Intelligence, University of Chinese Academy of Sciences; 3 AiRiA; 4 Tsinghua University; 5 Tencent AI Lab
Pseudocode | Yes | Appendix A (Pseudo Code): Algorithm 1, Control Policy's Training Step; Algorithm 2, Guide Policy's Training Step; Algorithm 3, Cross-Task Policy Guidance. (An illustrative sketch of this guidance loop appears after this table.)
Open Source Code | Yes | The full code is provided in the supplemental material.
Open Datasets | Yes | We conduct experiments on the Meta-World manipulation and HalfCheetah locomotion MTRL benchmarks... Meta-World benchmark [31]... HalfCheetah Task Group [11]
Dataset Splits | No | The paper describes training with a fixed number of environment samples and evaluating the final policy, but it does not explicitly specify train/validation/test splits (as percentages or sample counts) for the data used during training.
Hardware Specification | Yes | We use an AMD EPYC 7742 64-Core Processor with an NVIDIA GeForce RTX 3090 GPU for training.
Software Dependencies | No | The paper states: 'We implement all experiments using the MTRL codebase [21]', but does not specify exact version numbers for programming languages (e.g., Python) or specific libraries (e.g., PyTorch, TensorFlow) beyond citing the codebase.
Experiment Setup | Yes | F.3 Hyper-Parameters of All Methods. Table 3: General hyper-parameters of all methods... Table 10: Additional hyper-parameters of guide policy in CTPG.
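
The pseudocode row above names three algorithms: the control policy's training step, the guide policy's training step, and the overall cross-task policy guidance loop. As a rough illustration of the guidance idea only, the Python sketch below shows a hypothetical data-collection loop in which a per-task guide policy periodically selects which task's control policy acts as the behavior policy. All names (GuidePolicy, ControlPolicy, collect_rollout), the dummy environment dynamics, and the switching horizon are assumptions made for this sketch, not the authors' implementation or the released code.

```python
# Minimal, illustrative sketch of a CTPG-style data-collection loop.
# Assumptions (not from the paper): dummy scalar observations/actions,
# uniform guide-policy selection, and a fixed switching horizon H.
import random


class ControlPolicy:
    """Placeholder low-level policy for one task (e.g., an actor network)."""

    def __init__(self, task_id):
        self.task_id = task_id

    def act(self, obs):
        # A real implementation would evaluate a neural network;
        # here we return a dummy scalar action for illustration.
        return random.uniform(-1.0, 1.0)


class GuidePolicy:
    """Placeholder high-level policy that picks a behavior-policy index."""

    def __init__(self, num_tasks):
        self.num_tasks = num_tasks

    def select(self, obs, task_id):
        # A learned guide policy would choose based on value estimates;
        # here we sample uniformly over all tasks' control policies.
        return random.randrange(self.num_tasks)


def collect_rollout(task_id, control_policies, guide_policy,
                    horizon=5, episode_len=20):
    """Collect one episode for `task_id`, re-selecting the behavior
    policy every `horizon` steps via the guide policy."""
    obs = 0.0  # dummy observation
    trajectory = []
    for t in range(episode_len):
        if t % horizon == 0:
            behavior_idx = guide_policy.select(obs, task_id)
        action = control_policies[behavior_idx].act(obs)
        # Dummy environment transition and reward, for illustration only.
        obs, reward = obs + action, -abs(obs)
        trajectory.append((obs, action, reward, behavior_idx))
    return trajectory


if __name__ == "__main__":
    num_tasks = 4
    policies = [ControlPolicy(i) for i in range(num_tasks)]
    guide = GuidePolicy(num_tasks)
    traj = collect_rollout(task_id=0, control_policies=policies,
                           guide_policy=guide)
    print(f"collected {len(traj)} transitions; behavior policies used:",
          sorted({idx for *_, idx in traj}))
```

In an actual training pipeline, the collected trajectories would be stored in each task's replay buffer and used to update both the control policies and the guide policies; this sketch only shows the selection-and-rollout step suggested by the algorithm names.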