Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning

Authors: Yunfei Li, Tian Gao, Jiaqi Yang, Huazhe Xu, Yi Wu

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on a variety of goal-conditioned control problems, including relatively simple benchmarks such as pushing and ant-maze navigation, and a challenging sparse-reward cube-stacking task.
Researcher Affiliation | Academia | (1) Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China; (2) Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA; (3) Stanford University, CA, USA; (4) Shanghai Qi Zhi Institute, Shanghai, China.
Pseudocode | Yes | Algorithm 1: Phasic Self-Imitative Reduction (see the sketch below the table).
Open Source Code | Yes | The project webpage is at https://sites.google.com/view/pair-gcrl.
Open Datasets | Yes | Push is a robotic pushing environment adopted from (Nair et al., 2018b), simulated with the MuJoCo (Todorov et al., 2012) engine.
Dataset Splits | No | The paper describes a phasic training approach with online RL and offline SL, and mentions collecting data for training. However, it does not provide specific percentages or counts for train/validation/test dataset splits, nor does it refer to predefined splits for these purposes.
Hardware Specification | Yes | All the experiments are repeated over 3 random seeds on a single desktop machine with a GTX3090 GPU.
Software Dependencies | No | The paper mentions software such as MuJoCo and PyBullet, and algorithms such as PPO and the Adam optimizer, with citations to their original papers, but it does not provide explicit version numbers for these packages or for other common libraries (e.g., Python, PyTorch/TensorFlow).
Experiment Setup | Yes | All the hyper-parameters are listed in Table 2.
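The Pseudocode row refers to Algorithm 1 of the paper, which, as also noted in the Dataset Splits row, alternates an online RL phase with an offline self-imitation (supervised learning) phase. The Python sketch below illustrates only that high-level phasic alternation; it is not the authors' implementation. The names Trajectory, DemoBuffer, train_pair, collect_rollouts, rl_update, and imitation_update, as well as the default phase lengths, are hypothetical placeholders, and paper-specific details (e.g., how goals are reduced or relabeled) are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class Trajectory:
    """One goal-conditioned episode; `success` marks whether the goal was reached."""
    observations: list
    actions: list
    success: bool


@dataclass
class DemoBuffer:
    """Self-generated demonstrations harvested from successful episodes."""
    trajectories: List[Trajectory] = field(default_factory=list)

    def add_successes(self, rollouts: Sequence[Trajectory]) -> None:
        # Keep only episodes that reached their goal; they are reused as
        # demonstrations in the offline self-imitation phase.
        self.trajectories.extend(t for t in rollouts if t.success)


def train_pair(
    policy,
    collect_rollouts: Callable[[object, int], List[Trajectory]],
    rl_update: Callable[[object, List[Trajectory]], None],
    imitation_update: Callable[[object, List[Trajectory]], None],
    num_phases: int = 100,
    online_steps: int = 10_000,
    offline_epochs: int = 5,
) -> None:
    """Alternate an online RL phase with an offline self-imitation (SL) phase."""
    demos = DemoBuffer()
    for _ in range(num_phases):
        # Online phase: gather on-policy experience under the sparse reward
        # and apply RL updates (e.g., PPO, as cited in the paper).
        rollouts = collect_rollouts(policy, online_steps)
        rl_update(policy, rollouts)
        demos.add_successes(rollouts)

        # Offline phase: supervised self-imitation on accumulated successes.
        for _ in range(offline_epochs):
            imitation_update(policy, demos.trajectories)
```

In this sketch the RL and imitation update rules are passed in as callables, so the phasic schedule is independent of the particular policy-gradient or behavior-cloning implementation a reader might plug in.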