Generalizing Consistency Policy to Visual RL with Prioritized Proximal Experience Regularization
Authors: Haoran Li, Zhennan Jiang, Yuhui Chen, Dongbin Zhao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | CP3ER achieves new state-of-the-art (SOTA) performance in 21 tasks across DeepMind control suite and Meta-world. In this section, we evaluate the proposed method from the following aspects: 1) Does CP3ER have performance advantages compared to current SOTA methods? 2) Can policy regularization improve the behavior of the policy? 3) What is the impact of different modules on the performance? |
| Researcher Affiliation | Academia | Haoran Li; Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences; lihaoran2015@ia.ac.cn |
| Pseudocode | Yes | We have demonstrated the complete procedure of CP3ER in Algorithm 1, 2 and 3. |
| Open Source Code | Yes | Our project page is hosted at https://jzndd.github.io/CP3ER-Page/. Answer: [Yes] We have submitted the code for training as the supplementary material, and once the paper is accepted, the code will be fully open source. |
| Open Datasets | Yes | We evaluate the methods on 21 visual control tasks from DeepMind control suite [58] and Meta-world [59]. |
| Dataset Splits | No | The paper does not mention train/validation/test dataset splits in the traditional sense, because it is a reinforcement learning paper in which the agent interacts with environments. It refers to 'seed frames' and 'n-step returns' but not to dataset splits. |
| Hardware Specification | Yes | All evaluations are based on a single NVIDIA GeForce RTX 2080 Ti. |
| Software Dependencies | No | The paper mentions the Adam optimizer (Table 1), which is typically provided by a deep learning framework such as PyTorch or TensorFlow, but it does not specify versions for any libraries or frameworks used in the implementation. |
| Experiment Setup | Yes | We present a summary of all the hyperparameters for CP3ER in Table 1, where DMC is the abbreviation of DeepMind control suite. Table 1: The hyper-parameters for CP3ER. |