CUP: Critic-Guided Policy Reuse
Authors: Jin Zhang, Siyuan Li, Chongjie Zhang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate CUP on Meta-World (Yu et al., 2020), a popular reinforcement learning benchmark composed of multiple robot arm manipulation tasks. Empirical results demonstrate that CUP achieves efficient transfer and significantly outperforms baseline algorithms. |
| Researcher Affiliation | Academia | Jin Zhang¹, Siyuan Li², Chongjie Zhang¹. ¹Institute for Interdisciplinary Information Sciences, Tsinghua University, China; ²School of Computer Science and Technology, Harbin Institute of Technology, China |
| Pseudocode | Yes | Algorithm 1 CUP |
| Open Source Code | Yes | Codes are available at https://github.com/NagisaZj/CUP. |
| Open Datasets | Yes | We evaluate on Meta-World (Yu et al., 2020), a popular reinforcement learning benchmark composed of multiple robot arm manipulation tasks. |
| Dataset Splits | No | The paper mentions evaluating on Meta-World and averaging results over six random seeds, but it does not specify explicit train/validation/test dataset splits with percentages or sample counts. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using SAC as the underlying algorithm but does not specify software versions for libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | All the results are averaged over six random seeds. CUP introduces only two additional hyper-parameters to the underlying SAC algorithm, and we further test CUP's sensitivity to these additional hyper-parameters. Additional implementation details are deferred to Appendix D.1. |