Potential Driven Reinforcement Learning for Hard Exploration Tasks

Authors: Enmin Zhao, Shihong Deng, Yifan Zang, Yongxin Kang, Kai Li, Junliang Xing

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental analyses and comparisons on multiple challenging hard exploration environments have verified its effectiveness and efficiency." and "To verify the efficiency of the PotER sampling algorithm, we evaluate its performance on two separate domains: 1) a simple maze with discrete state space (Fig. 3, left) and 2) the hard exploration Atari games with continuous state space."
Researcher Affiliation | Academia | "1 Institute of Automation, Chinese Academy of Sciences; 2 School of Artificial Intelligence, University of Chinese Academy of Sciences; {zhaoenmin2018, shihong.deng, zangyifan2019, kangyongxin2018, kai.li, junliang.xing}@ia.ac.cn"
Pseudocode | Yes | "Algorithm 1: PotER based RL with SIL." (a hedged sketch of such a training loop is given after this table)
Open Source Code | Yes | "The source code of this work is available at https://github.com/ZhaoEnMin/PotER."
Open Datasets | No | The paper mentions standard Atari games such as 'Montezuma's Revenge', 'Freeway', 'Gravitar', and 'Private Eye' as experimental environments, but it does not provide concrete access information (specific links, DOIs, repositories, or formal citations) for datasets (e.g., ROMs or specific data files) or for the custom maze environment used.
Dataset Splits | No | The paper specifies the number of random seeds used for the experiments (e.g., '3 random seeds' for the maze, '5 random seeds in 50M time steps' for Atari) but does not provide explicit train/validation/test dataset splits, percentages, or sample counts.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory sizes, or other system specifications used to run the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries (e.g., Python, PyTorch, TensorFlow) used in the experiments.
Experiment Setup | Yes | "The main hyper-parameters of our algorithm are the number of iterations used to set goals N, the influence distance of the repulsive potential field d_o, the attractive parameter k_a and the repulsive parameter k_r. Because in different games, agents have different average steps to lose health, we set N as 50... we set k_r to ... and k_a to any positive value... In specific, for the maze games, we set d_o to 1. For the Atari games, we set d_o to 10." and "In the Atari experiments, we convert the 84×84 input RGB frames to gray-scale images. The input of the convolutional neural networks... are the last 4 stacked gray-scale frames. For the SIL and SIL+PotER algorithms, we perform four SIL updates in training each model."
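
The quoted setup is concrete enough to sketch in code. The following Python snippet is a minimal illustration, not the authors' released implementation: it assumes OpenCV and NumPy for the 84×84 gray-scale conversion and 4-frame stacking, and it collects the quoted hyper-parameters in a dictionary. The names (to_grayscale_84x84, FrameStacker, HYPERPARAMS) are hypothetical, and the k_a / k_r values are placeholders, since the quote only constrains them to be positive and elides the k_r value.

```python
# Minimal sketch of the quoted Atari preprocessing; assumes OpenCV and NumPy.
# Function/class names are illustrative, not taken from the authors' code.
from collections import deque

import cv2
import numpy as np


def to_grayscale_84x84(rgb_frame):
    """Convert one RGB Atari frame to a single 84x84 gray-scale image."""
    gray = cv2.cvtColor(rgb_frame, cv2.COLOR_RGB2GRAY)
    return cv2.resize(gray, (84, 84), interpolation=cv2.INTER_AREA)


class FrameStacker:
    """Keep the last 4 gray-scale frames as the network input (84x84x4)."""

    def __init__(self, num_frames=4):
        self.frames = deque(maxlen=num_frames)

    def reset(self, first_rgb_frame):
        frame = to_grayscale_84x84(first_rgb_frame)
        for _ in range(self.frames.maxlen):
            self.frames.append(frame)
        return np.stack(list(self.frames), axis=-1)

    def step(self, rgb_frame):
        self.frames.append(to_grayscale_84x84(rgb_frame))
        return np.stack(list(self.frames), axis=-1)


# Hyper-parameters taken from the quoted setup; k_a and k_r are placeholders
# because the quote only requires them to be positive (the k_r value is elided).
HYPERPARAMS = {
    "goal_setting_iterations_N": 50,          # iterations used to set goals
    "repulsive_influence_distance_d_o": 10,   # 1 for the maze games, 10 for Atari
    "attractive_parameter_k_a": 1.0,          # placeholder: "any positive value"
    "repulsive_parameter_k_r": 1.0,           # placeholder: value elided in quote
    "sil_updates_per_model_update": 4,
}
```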
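Algorithm 1 (PotER based RL with SIL) is only named above, not reproduced. The sketch below shows how the quoted "four SIL updates in training each model" could look, following the standard self-imitation learning loss of Oh et al. (2018); it is an assumption-laden illustration, not the paper's algorithm. It assumes PyTorch, a policy_value_net that returns action logits and a state value, and a replay buffer whose sample method stands in for the paper's potential-driven (PotER) sampling; all of these names and the batch size and value coefficient are hypothetical.

```python
# Hedged sketch of the "four SIL updates" step; the self-imitation loss follows
# Oh et al. (2018). `policy_value_net` and `buffer` are hypothetical interfaces;
# in PotER the buffer would sample with potential-driven priorities.
import torch.nn.functional as F


def sil_updates(policy_value_net, optimizer, buffer,
                batch_size=64, value_coef=0.01, num_updates=4):
    """Run the self-imitation updates performed after each model update."""
    for _ in range(num_updates):
        # Assumed buffer API: past states, chosen actions, and observed returns.
        states, actions, returns = buffer.sample(batch_size)

        logits, values = policy_value_net(states)
        log_probs = F.log_softmax(logits, dim=-1)
        chosen_log_probs = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)

        # Learn only from transitions whose return exceeds the current value.
        clipped_advantage = (returns - values.squeeze(1)).clamp(min=0)

        policy_loss = -(chosen_log_probs * clipped_advantage.detach()).mean()
        value_loss = 0.5 * (clipped_advantage ** 2).mean()
        loss = policy_loss + value_coef * value_loss

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```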