Exploration via State Influence Modeling
Authors: Yongxin Kang, Enmin Zhao, Kai Li, Junliang Xing
AAAI 2021, pp. 8047-8054
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental analyses and comparisons in Grid Maze and many hard exploration Atari 2600 games demonstrate its high exploration efficiency. |
| Researcher Affiliation | Academia | 1 School of Artificial Intelligence, University of Chinese Academy of Sciences; 2 Institute of Automation, Chinese Academy of Sciences |
| Pseudocode | Yes | Algorithm 1 SI-based RL framework |
| Open Source Code | No | The source code, trained models, and all the experimental results will be released to facilitate further studies on reinforcement learning in hard exploration tasks. |
| Open Datasets | Yes | Extensive experimental analyses and comparisons in Grid Maze and many hard exploration Atari 2600 games demonstrate its high exploration efficiency. In the Atari experiments, we convert the 210 × 160 input RGB frames to grayscale images and resize them to 42 × 42 images following the practice in (Tang et al. 2017; Bellemare et al. 2016). |
| Dataset Splits | No | No specific information about training/validation/test dataset splits (e.g., percentages, counts, or explicit splitting methodology) was found. |
| Hardware Specification | No | No specific details about the hardware (e.g., CPU, GPU models, memory) used for running experiments were provided. |
| Software Dependencies | No | No specific software versions or library dependencies were mentioned for reproducibility (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | In all the evaluations conducted in the experiment part, β is experimentally set to 1. In the Atari experiments, we convert the 210 × 160 input RGB frames to grayscale images and resize them to 42 × 42 images following the practice in (Tang et al. 2017; Bellemare et al. 2016). The position of the agent is then discretized into a state in 42 × 42, and we set SI(s) with the same settings in Eqn. (12) and Eqn. (13). |
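The frame preprocessing quoted above (210 × 160 RGB frames converted to grayscale and resized to 42 × 42) can be sketched as follows. This is a minimal NumPy illustration, not the authors' released code; the paper does not specify the interpolation method, so nearest-neighbour sampling and BT.601 luminance weights are assumed here.

```python
import numpy as np

def preprocess(frame: np.ndarray, size: int = 42) -> np.ndarray:
    """Convert a 210x160x3 RGB Atari frame to a size x size grayscale image.

    Sketch of the preprocessing described in the paper (following
    Tang et al. 2017; Bellemare et al. 2016). Interpolation choice
    is an assumption: nearest-neighbour sampling is used.
    """
    # Luminance-weighted grayscale conversion (ITU-R BT.601 weights).
    gray = frame @ np.array([0.299, 0.587, 0.114])
    # Nearest-neighbour downsample: pick size evenly spaced rows/cols.
    rows = np.linspace(0, gray.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, gray.shape[1] - 1, size).round().astype(int)
    return gray[np.ix_(rows, cols)]

frame = np.zeros((210, 160, 3), dtype=np.uint8)  # dummy RGB frame
obs = preprocess(frame)
print(obs.shape)  # (42, 42)
```

Each preprocessed 42 × 42 image then serves as the discretized state space over which SI(s) is computed, per the setup row above.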