Exploration via State Influence Modeling
Authors: Yongxin Kang, Enmin Zhao, Kai Li, Junliang Xing
AAAI 2021, pp. 8047-8054
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental analyses and comparisons in Grid Maze and many hard exploration Atari 2600 games demonstrate its high exploration efficiency. |
| Researcher Affiliation | Academia | 1 School of Artificial Intelligence, University of Chinese Academy of Sciences; 2 Institute of Automation, Chinese Academy of Sciences |
| Pseudocode | Yes | Algorithm 1 SI-based RL framework |
| Open Source Code | No | The source code, trained models, and all the experimental results will be released to facilitate further studies on reinforcement learning in hard exploration tasks. |
| Open Datasets | Yes | Extensive experimental analyses and comparisons in Grid Maze and many hard exploration Atari 2600 games demonstrate its high exploration efficiency. In the Atari experiments, we convert the 210 × 160 input RGB frames to grayscale images and resize them to 42 × 42 images following the practice in (Tang et al. 2017; Bellemare et al. 2016). |
| Dataset Splits | No | No specific information about training/validation/test dataset splits (e.g., percentages, counts, or explicit splitting methodology) was found. |
| Hardware Specification | No | No specific details about the hardware (e.g., CPU, GPU models, memory) used for running experiments were provided. |
| Software Dependencies | No | No specific software versions or library dependencies were mentioned for reproducibility (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | In all the evaluations conducted in the experiment part, β is experimentally set to 1. In the Atari experiments, we convert the 210 × 160 input RGB frames to grayscale images and resize them to 42 × 42 images following the practice in (Tang et al. 2017; Bellemare et al. 2016). The position of the agent is then discretized into a state in 42 × 42, and we set SI(s) with the same settings in Eqn. (12) and Eqn. (13). |
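The frame preprocessing quoted above (210 × 160 RGB frames converted to grayscale and resized to 42 × 42) can be sketched as follows. This is a minimal NumPy illustration, not the authors' released code; the paper does not specify the interpolation method, so nearest-neighbour sampling and BT.601 luminance weights are assumed here.

```python
import numpy as np

def preprocess(frame: np.ndarray, size: int = 42) -> np.ndarray:
    """Convert a 210x160x3 RGB Atari frame to a size x size grayscale image.

    Sketch of the preprocessing described in the paper (following
    Tang et al. 2017; Bellemare et al. 2016). Interpolation choice
    is an assumption: nearest-neighbour sampling is used.
    """
    # Luminance-weighted grayscale conversion (ITU-R BT.601 weights).
    gray = frame @ np.array([0.299, 0.587, 0.114])
    # Nearest-neighbour downsample: pick size evenly spaced rows/cols.
    rows = np.linspace(0, gray.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, gray.shape[1] - 1, size).round().astype(int)
    return gray[np.ix_(rows, cols)]

frame = np.zeros((210, 160, 3), dtype=np.uint8)  # dummy RGB frame
obs = preprocess(frame)
print(obs.shape)  # (42, 42)
```

Each preprocessed 42 × 42 image then serves as the discretized state space over which SI(s) is computed, per the setup row above.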