Value-Evolutionary-Based Reinforcement Learning

Authors: Pengyi Li, Jianye Hao, Hongyao Tang, Yan Zheng, Fazl Barez

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on MinAtar and Atari demonstrate the superiority of VEB-RL in significantly improving DQN, Rainbow, and SPR. This section empirically evaluates VEB-RL on a range of tasks.
Researcher Affiliation | Academia | (1) College of Intelligence and Computing, Tianjin University, China; (2) Edinburgh Centre for Robotics; (3) University of Oxford; (4) Centre for the Study of Existential Risk, University of Cambridge
Pseudocode | Yes | Algorithm 1: Value-Evolutionary-Based RL
Open Source Code | Yes | Our code is available on https://github.com/yeshenpy/VEB-RL.
Open Datasets | Yes | We first consider the MinAtar benchmark (Young & Tian, 2019), which is a testbed of miniaturized versions of several Atari games. We further verify whether CEM-VEB-RL and GA-VEB-RL can further improve Rainbow on six popular Atari tasks: Breakout, Space Invaders, Qbert, Pong, Battle Zone, and Name This Game, in which agents need to take high-dimensional pixel images as inputs.
Dataset Splits | No | The paper mentions "training process" and "training steps" but does not provide specific percentages or counts for train/validation/test splits, nor does it reference predefined splits with citations for reproducibility.
Hardware Specification | Yes | All experiments are carried out on an NVIDIA GTX 2080 Ti GPU with an Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz.
Software Dependencies | No | The paper mentions "Optimizer Adam" and refers to implementations from ERL and CEM-RL (with GitHub links), but it does not specify software dependencies such as Python, PyTorch, or TensorFlow with explicit version numbers.
Experiment Setup | Yes | Table 3. Details of settings:
- Optimizer: Adam
- Learning rate: 3e-4
- Replay buffer size: 1e5
- Number of hidden layers for Q network: 2
- Number of hidden units per layer: 1024, 128
- Batch size: 32
- Number of convolutional layers: 1
- Out channels of the convolutional layer: 16
- Kernel size of the convolutional layer: 3 × 3
- Stride of the convolutional layer: 1
- Discount factor γ: 0.99
- Steps to update the target network: 1000
- Sample size for calculating the fitness N: 5120 in MinAtar, 1024 in Atari
- Population size: 10
- Update frequency of target network in the population H: 20
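For quick reference, the settings quoted above can be collected into a single configuration object. The following is a minimal Python sketch assembled from Table 3; the key names are chosen for illustration here and are not taken from the released VEB-RL code.

# Hedged sketch: hyperparameters from Table 3 gathered into one dict.
# Key names are illustrative, not the identifiers used in the official repository.
veb_rl_config = {
    "optimizer": "Adam",
    "learning_rate": 3e-4,
    "replay_buffer_size": int(1e5),
    "q_hidden_layers": 2,
    "q_hidden_units": (1024, 128),
    "batch_size": 32,
    "num_conv_layers": 1,
    "conv_out_channels": 16,
    "conv_kernel_size": (3, 3),
    "conv_stride": 1,
    "discount_gamma": 0.99,
    "target_update_steps": 1000,
    # Sample size N used when computing fitness differs by benchmark.
    "fitness_sample_size_N": {"minatar": 5120, "atari": 1024},
    "population_size": 10,
    "population_target_update_freq_H": 20,
}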