Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Efficient Multi-task Reinforcement Learning with Cross-Task Policy Guidance
Authors: Jinmin He, Kai Li, Yifan Zang, Haobo Fu, Qiang Fu, Junliang Xing, Jian Cheng
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations demonstrate that incorporating CTPG with these approaches significantly enhances performance in manipulation and locomotion benchmarks. |
| Researcher Affiliation | Collaboration | 1 Institute of Automation, Chinese Academy of Sciences; 2 School of Artificial Intelligence, University of Chinese Academy of Sciences; 3 AiRiA; 4 Tsinghua University; 5 Tencent AI Lab |
| Pseudocode | Yes | A Pseudo Code. Algorithm 1: Control Policy's Training Step. Algorithm 2: Guide Policy's Training Step. Algorithm 3: Cross-Task Policy Guidance. |
| Open Source Code | Yes | The full code is provided in supplemental material |
| Open Datasets | Yes | We conduct experiments on MetaWorld manipulation and HalfCheetah locomotion MTRL benchmarks... MetaWorld benchmark [31]... HalfCheetah Task Group [11] |
| Dataset Splits | No | The paper describes training with a certain number of samples and evaluating the final policy, but it does not explicitly specify train/validation/test dataset splits with percentages or sample counts for the data used during its own training process. |
| Hardware Specification | Yes | We use AMD EPYC 7742 64-Core Processor with NVIDIA Geforce RTX 3090 GPU for training. |
| Software Dependencies | No | The paper states: 'We implement all experiments using the MTRL codebase [21]', but does not specify exact version numbers for programming languages (e.g., Python) or specific libraries (e.g., PyTorch, TensorFlow) beyond citing the codebase. |
| Experiment Setup | Yes | F.3 Hyper-Parameters of All Method. Table 3: General hyper-parameters of all methods... Table 10: Additional hyper-parameters of guide policy in CTPG. |