State Entropy Maximization with Random Encoders for Efficient Exploration
Authors: Younggyo Seo, Lili Chen, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that RE3 significantly improves the sample-efficiency of both model-free and model-based RL methods on locomotion and navigation tasks from DeepMind Control Suite and MiniGrid benchmarks. |
| Researcher Affiliation | Collaboration | ¹KAIST, ²UC Berkeley, ³University of Michigan, ⁴LG AI Research. |
| Pseudocode | Yes | Algorithm 1 RE3: Off-policy RL version (a hedged sketch of its core reward computation follows the table). |
| Open Source Code | Yes | Source code is available at https://sites.google.com/view/re3-rl. |
| Open Datasets | Yes | RE3 significantly improves the sample-efficiency of both model-free and model-based RL methods on widely used DeepMind Control Suite (Tassa et al., 2020), MiniGrid (Chevalier-Boisvert et al., 2018), and Atari (Bellemare et al., 2013) benchmarks. |
| Dataset Splits | No | The paper uses well-known benchmarks but does not specify explicit training/validation/test splits (e.g., percentages or sample counts) in its text. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9') needed to replicate the experiments. |
| Experiment Setup | Yes | As for the newly introduced hyperparameters, we use k = 3, β0 ∈ {0.05, 0.25}, and ρ ∈ {0.0, 0.00001, 0.000025}. We provide more details in Appendix A. (The decay schedule these hyperparameters control is sketched after the table.) |
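
The "Pseudocode" row points to the paper's Algorithm 1 (RE3, off-policy version). Below is a minimal PyTorch sketch of its core step: the k-nearest-neighbor state-entropy intrinsic reward, computed in the representation space of a frozen, randomly initialized encoder. The paper defines r_int(s_i) = log(||y_i − y_i^{k-NN}||₂ + 1) with y_i = f_θ(s_i); the class and function names here (`RandomEncoder`, `intrinsic_reward`) and the specific conv architecture are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class RandomEncoder(nn.Module):
    """Convolutional encoder whose weights stay fixed at their random
    initialization for the whole of training (no gradient updates).
    Architecture is an assumption, not the paper's exact network."""

    def __init__(self, obs_shape, latent_dim=50):
        super().__init__()
        c, h, w = obs_shape
        self.convs = nn.Sequential(
            nn.Conv2d(c, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        with torch.no_grad():
            n_flat = self.convs(torch.zeros(1, c, h, w)).shape[1]
        self.fc = nn.Linear(n_flat, latent_dim)
        for p in self.parameters():
            p.requires_grad_(False)  # the encoder is never trained

    def forward(self, obs):
        return self.fc(self.convs(obs))


@torch.no_grad()
def intrinsic_reward(encoder, obs_batch, buffer_obs, k=3):
    """r_int(s_i) = log(||y_i - y_i^(k-NN)||_2 + 1), with distances taken
    in the random encoder's representation space."""
    y = encoder(obs_batch)         # (B, d) query representations
    y_buf = encoder(buffer_obs)    # (M, d) representations of sampled states
    dists = torch.cdist(y, y_buf)  # (B, M) pairwise Euclidean distances
    # Distance to the k-th nearest neighbor; take the (k+1) smallest in case
    # each query state also appears in buffer_obs (self-distance of zero).
    knn_dist = dists.topk(k + 1, largest=False).values[:, -1]
    return torch.log(knn_dist + 1.0)
```

In the off-policy setting of Algorithm 1, the comparison set for the k-NN search is a batch of states sampled from the replay buffer; an on-policy variant would instead use the current batch of on-policy samples.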
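The "Experiment Setup" row lists the newly introduced hyperparameters k, β0, and ρ. In the paper, ρ controls an exponential decay of the intrinsic-reward scale, β_t = β0 · (1 − ρ)^t, and the agent is trained on the combined reward r_ext + β_t · r_int. A one-line sketch, assuming this schedule (the function name is ours):

```python
def beta_schedule(step, beta0=0.05, rho=1e-5):
    """Exponentially decayed intrinsic-reward scale:
    beta_t = beta0 * (1 - rho)**step.
    With rho = 0.0 the scale stays constant at beta0."""
    return beta0 * (1.0 - rho) ** step

# Combining rewards at update step t (names illustrative):
# r_total = r_extrinsic + beta_schedule(t) * r_intrinsic
```

The grid in the row above (β0 ∈ {0.05, 0.25}, ρ ∈ {0.0, 0.00001, 0.000025}) spans a constant schedule (ρ = 0) and two slow decay rates.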