State Entropy Maximization with Random Encoders for Efficient Exploration

Authors: Younggyo Seo, Lili Chen, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that RE3 significantly improves the sample-efficiency of both model-free and model-based RL methods on locomotion and navigation tasks from DeepMind Control Suite and MiniGrid benchmarks.
Researcher Affiliation | Collaboration | KAIST, UC Berkeley, University of Michigan, and LG AI Research.
Pseudocode | Yes | Algorithm 1 RE3: Off-policy RL version (a minimal sketch of the intrinsic-reward computation appears after this table).
Open Source Code | Yes | Source code is available at https://sites.google.com/view/re3-rl.
Open Datasets | Yes | RE3 significantly improves the sample-efficiency of both model-free and model-based RL methods on widely used DeepMind Control Suite (Tassa et al., 2020), MiniGrid (Chevalier-Boisvert et al., 2018), and Atari (Bellemare et al., 2013) benchmarks.
Dataset Splits | No | The paper uses well-known benchmarks but does not specify explicit training/validation/test splits (e.g., percentages or sample counts) in its text.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used to run its experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiments.
Experiment Setup | Yes | As for the newly introduced hyperparameters, we use k = 3, β0 ∈ {0.05, 0.25}, and ρ ∈ {0.0, 0.00001, 0.000025}. We provide more details in Appendix A. (A sketch of the β decay schedule follows the intrinsic-reward example below.)
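On the "Pseudocode" row above: Algorithm 1 in the paper computes an intrinsic reward from a k-nearest-neighbor state-entropy estimate in the representation space of a randomly initialized, frozen encoder. The code below is a minimal sketch of that idea, not the authors' implementation: it assumes a random linear encoder (the paper uses a frozen convolutional encoder on image observations), and the names `RandomEncoder` and `re3_intrinsic_reward` are hypothetical.

```python
# Sketch of the RE3-style intrinsic reward, assuming a random linear encoder.
import numpy as np

class RandomEncoder:
    """Fixed random projection standing in for the paper's frozen CNN encoder."""
    def __init__(self, obs_dim, feature_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(obs_dim, feature_dim)) / np.sqrt(obs_dim)

    def __call__(self, obs):
        # obs: (batch, obs_dim) array -> low-dimensional representation y = f_theta(s)
        return np.tanh(obs @ self.W)

def re3_intrinsic_reward(batch_obs, buffer_obs, encoder, k=3):
    """k-NN state-entropy estimate: r_int_i = log(||y_i - y_i^(k-NN)||_2 + 1)."""
    y_batch = encoder(batch_obs)              # (B, d)
    y_buffer = encoder(buffer_obs)            # (N, d)
    # Pairwise Euclidean distances between minibatch and buffer representations.
    dists = np.linalg.norm(y_batch[:, None, :] - y_buffer[None, :, :], axis=-1)
    # Distance of each minibatch state to its k-th nearest neighbor in the buffer.
    knn_dist = np.sort(dists, axis=1)[:, k - 1]
    return np.log(knn_dist + 1.0)

# Usage: the total reward would be r_ext + beta_t * r_int (beta_t sketched below).
if __name__ == "__main__":
    enc = RandomEncoder(obs_dim=16, feature_dim=8)
    rng = np.random.default_rng(1)
    buffer_obs = rng.normal(size=(512, 16))   # stand-in for replay-buffer states
    batch_obs = rng.normal(size=(32, 16))     # stand-in for a sampled minibatch
    print(re3_intrinsic_reward(batch_obs, buffer_obs, enc, k=3).shape)  # (32,)
```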
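The ρ values quoted in the "Experiment Setup" row control how quickly the intrinsic-reward coefficient β is decayed over training. The snippet below is illustrative only, assuming the exponential decay form β_t = β0 · (1 − ρ)^t; the exact schedule and step counter are among the details the paper defers to Appendix A.

```python
def beta_schedule(beta_0: float, rho: float, step: int) -> float:
    """Intrinsic-reward weight at training step `step`: beta_t = beta_0 * (1 - rho)**t."""
    return beta_0 * (1.0 - rho) ** step

# Example: with beta_0 = 0.25 and rho = 0.000025, beta decays to roughly
# 0.25 * exp(-2.5) ~ 0.02 after 100,000 steps; rho = 0.0 keeps beta constant.
```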