Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning
Authors: Yunfei Li, Tian Gao, Jiaqi Yang, Huazhe Xu, Yi Wu
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on a variety of goal-conditioned control problems, including relatively simple benchmarks such as pushing and ant-maze navigation, and a challenging sparse-reward cube-stacking task. |
| Researcher Affiliation | Academia | 1Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China 2Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA 3Stanford University, CA, USA 4Shanghai Qi Zhi Institute, Shanghai, China. |
| Pseudocode | Yes | Algorithm 1 Phasic Self-Imitative Reduction (an illustrative sketch of this phasic loop follows the table) |
| Open Source Code | Yes | The project webpage is at https://sites.google.com/view/pair-gcrl. |
| Open Datasets | Yes | Push is a robotic pushing environment adopted from (Nair et al., 2018b), simulated with the MuJoCo (Todorov et al., 2012) engine. |
| Dataset Splits | No | The paper describes a phasic training approach that alternates online RL and offline SL and mentions collecting data for training, but it does not give percentages or counts for train/validation/test splits, nor does it refer to predefined splits. |
| Hardware Specification | Yes | All the experiments are repeated over 3 random seeds on a single desktop machine with a GTX 3090 GPU. |
| Software Dependencies | No | The paper mentions software like MuJoCo, PyBullet, and algorithms like PPO and the Adam optimizer with citations to their original papers, but does not provide explicit version numbers for these packages or other common libraries (e.g., Python, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | All the hyper-parameters are listed in Table 2. |
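The pseudocode row above refers to Algorithm 1, which alternates an online RL phase with an offline supervised-learning (SL) phase that imitates the agent's own successful trajectories. The Python sketch below illustrates only that control flow; the `env`/`agent` interfaces (`act`, `rl_update`, `sl_update`), the `rollout` helper, the success threshold, and all loop sizes are hypothetical placeholders, not values from the paper or its released code.

```python
import numpy as np

def sparse_goal_reward(achieved_goal, desired_goal, threshold=0.05):
    """Sparse goal-conditioned reward: 1.0 when the goal is reached, else 0.0.
    The distance threshold is an illustrative placeholder."""
    return float(np.linalg.norm(achieved_goal - desired_goal) <= threshold)

def rollout(env, agent, max_steps=50):
    """Collect one episode. Assumes env.reset() -> (obs, goal) and
    env.step(action) -> (obs, achieved_goal); both interfaces are hypothetical."""
    obs, goal = env.reset()
    trajectory, succeeded = [], False
    for _ in range(max_steps):
        action = agent.act(obs, goal)
        obs, achieved = env.step(action)
        trajectory.append((obs, goal, action))
        if sparse_goal_reward(achieved, goal) > 0:
            succeeded = True
            break
    return trajectory, succeeded

def pair_training_loop(env, agent, num_phases=100,
                       rl_steps_per_phase=10_000, sl_epochs_per_phase=5):
    """Skeleton of the phasic alternation: each phase runs online RL,
    then offline SL on self-collected successful trajectories."""
    success_buffer = []  # (obs, goal, action) tuples from successful episodes

    for _ in range(num_phases):
        # Online RL phase: collect experience and update the policy
        # (the paper cites PPO; rl_update stands in for that update).
        steps = 0
        while steps < rl_steps_per_phase:
            trajectory, succeeded = rollout(env, agent)
            agent.rl_update(trajectory)
            if succeeded:
                success_buffer.extend(trajectory)
            steps += len(trajectory)

        # Offline SL phase: behavior-cloning-style self-imitation
        # on the successful data gathered so far.
        for _ in range(sl_epochs_per_phase):
            for obs, goal, action in success_buffer:
                agent.sl_update(obs, goal, action)
```

The structural point of the sketch is that the SL phase consumes only the agent's own successful trajectories, which is what makes the reduction self-imitative; everything else (update rules, buffer management, phase lengths) is simplified.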