Neural Episodic Control with State Abstraction
Authors: Zhuo Li, Derui Zhu, Yujing Hu, Xiaofei Xie, Lei Ma, Yan Zheng, Yan Song, Yingfeng Chen, Jianjun Zhao
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach to the MuJoCo and Atari tasks in OpenAI Gym domains. The experimental results indicate that NECSA achieves higher sample efficiency than the state-of-the-art episodic control-based approaches. |
| Researcher Affiliation | Collaboration | Zhuo Li1, Derui Zhu2, Yujing Hu3, Xiaofei Xie4, Lei Ma5,6, Yan Zheng7, Yan Song3, Yingfeng Chen3, Jianjun Zhao1 — 1Kyushu University, 2Technical University of Munich, 3NetEase Fuxi AI Lab, 4Singapore Management University, 5University of Alberta, 6The University of Tokyo, 7Tianjin University |
| Pseudocode | Yes | Algorithm 1 NECSA. |
| Open Source Code | Yes | Our data and code are available at the project website1. 1https://sites.google.com/view/drl-necsa |
| Open Datasets | Yes | We conduct the experiments on nine MuJoCo tasks and six Atari games in OpenAI Gym (Brockman et al., 2016) domains. |
| Dataset Splits | No | The paper mentions training steps and evaluation results but does not explicitly describe training/validation/test dataset splits with specific percentages or sample counts for reproduction. |
| Hardware Specification | Yes | All the experiments were run on powerful servers with CPU (Intel(R) Core(TM) i9-10940X CPU @ 3.30GHz) and GPU (NVIDIA Corporation GA102GL [RTX A6000]) with 128GB RAM. |
| Software Dependencies | No | The paper states: 'We also adopt the implementation of DDPG, TD3, DQN, and Rainbow in (Weng et al., 2021).' and 'The implementation of NECSA is based on TD3.' Although the Tianshou library (Weng et al., 2021) is cited, no specific version numbers are given for Tianshou or for other software components such as PyTorch or CUDA. |
| Experiment Setup | Yes | Our hyperparameter settings are listed in Table 1 and Table 2. ... For MuJoCo tasks, the neural network consists of two hidden layers. The size of each layer is 256. The activation unit is ReLU. Both the Actor and Critic networks share the same structure. For Atari tasks, we use a three-layer convolutional neural network head and a fully connected layer to output the Q-values of each action. |
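The MuJoCo network described in the setup quote (two hidden layers of 256 units with ReLU activations, with Actor and Critic sharing the same structure) can be sketched as a plain forward pass. This is an illustrative reconstruction from the quoted description only, not the authors' released code; the input/output dimensions below are placeholders chosen for the example.

```python
import numpy as np

def relu(x):
    """ReLU activation used in the paper's MuJoCo networks."""
    return np.maximum(0.0, x)

def mlp_forward(obs, params):
    """Forward pass of a two-hidden-layer (256-unit) ReLU MLP, the shape
    the paper describes for both the Actor and Critic on MuJoCo tasks."""
    W1, b1, W2, b2, W3, b3 = params
    h1 = relu(obs @ W1 + b1)   # hidden layer 1, size 256
    h2 = relu(h1 @ W2 + b2)    # hidden layer 2, size 256
    return h2 @ W3 + b3        # Actor: action output; Critic: Q-value

def init_params(in_dim, out_dim, hidden=256, seed=0):
    """Randomly initialize weights for the sketch above (placeholder init)."""
    rng = np.random.default_rng(seed)
    dims = [in_dim, hidden, hidden, out_dim]
    params = []
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        params += [rng.normal(0.0, 0.1, (d_in, d_out)), np.zeros(d_out)]
    return params

# Example: an 11-dimensional observation mapped to a 3-dimensional action
# (dimensions are hypothetical, roughly matching a small MuJoCo task).
params = init_params(11, 3)
action = mlp_forward(np.ones(11), params)
print(action.shape)  # (3,)
```

The Atari variant quoted above would swap this MLP body for a three-layer convolutional head followed by a fully connected layer emitting per-action Q-values; the overall forward-pass structure is the same.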