Exploration via State Influence Modeling

Authors: Yongxin Kang, Enmin Zhao, Kai Li, Junliang Xing

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental analyses and comparisons in Grid Maze and many hard exploration Atari 2600 games demonstrate its high exploration efficiency.
Researcher Affiliation | Academia | 1. School of Artificial Intelligence, University of Chinese Academy of Sciences; 2. Institute of Automation, Chinese Academy of Sciences
Pseudocode | Yes | Algorithm 1: SI-based RL framework
Open Source Code | No | The source code, trained models, and all the experimental results will be released to facilitate further studies on reinforcement learning in hard exploration tasks.
Open Datasets | Yes | Extensive experimental analyses and comparisons in Grid Maze and many hard exploration Atari 2600 games demonstrate its high exploration efficiency. In the Atari experiments, we convert the 210 × 160 input RGB frames to grayscale images and resize them to 42 × 42 images following the practice in (Tang et al. 2017; Bellemare et al. 2016).
Dataset Splits | No | No specific information about training/validation/test dataset splits (e.g., percentages, counts, or explicit splitting methodology) was found.
Hardware Specification | No | No specific details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments were provided.
Software Dependencies | No | No specific software versions or library dependencies (e.g., PyTorch 1.9, TensorFlow 2.x) were mentioned for reproducibility.
Experiment Setup | Yes | In all the evaluations conducted in the experiment part, β is experimentally set to 1. In the Atari experiments, we convert the 210 × 160 input RGB frames to grayscale images and resize them to 42 × 42 images following the practice in (Tang et al. 2017; Bellemare et al. 2016). The position of the agent is then discretized into a state in a 42 × 42 grid, and we set SI(s) with the same settings as in Eqn. (12) and Eqn. (13).
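The frame preprocessing quoted in the Experiment Setup row (210 × 160 RGB frames converted to grayscale and downsampled to 42 × 42, following Tang et al. 2017 and Bellemare et al. 2016) can be reproduced with standard image tooling. The sketch below is a minimal illustration assuming OpenCV and NumPy; the function name `preprocess_frame` is hypothetical, and this is not the authors' released code.

```python
import cv2
import numpy as np

def preprocess_frame(frame: np.ndarray) -> np.ndarray:
    """Convert a raw 210x160x3 RGB Atari frame to a 42x42 grayscale image.

    Illustrative sketch of the preprocessing described in the paper
    (following Tang et al. 2017; Bellemare et al. 2016); not the authors'
    released implementation.
    """
    assert frame.shape == (210, 160, 3), "expected a raw Atari RGB frame"
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)                      # drop color channels
    small = cv2.resize(gray, (42, 42), interpolation=cv2.INTER_AREA)    # downsample to 42x42
    return small.astype(np.uint8)
```

Under this sketch, the 42 × 42 output can also serve as the discretization grid for the agent's position mentioned in the same quote, since both use the same spatial resolution.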