MADE: Exploration via Maximizing Deviation from Explored Regions
Authors: Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph E. Gonzalez, Stuart Russell
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As a proof of concept, we evaluate the new intrinsic reward on tabular examples across a variety of model-based and model-free algorithms, showing improvements over count-only exploration strategies. When tested on navigation and locomotion tasks from MiniGrid and DeepMind Control Suite benchmarks, our approach significantly improves sample efficiency over state-of-the-art methods. |
| Researcher Affiliation | Collaboration | University of California, Berkeley ({tianjunz,paria.rashidinejad,jiantao,russell}@berkeley.edu); Facebook AI Research (yuandong@fb.com) |
| Pseudocode | Yes | Algorithm 1 Policy computation for adaptively regularized objective |
| Open Source Code | Yes | Our code is available at https://github.com/tianjunz/MADE. |
| Open Datasets | Yes | When tested in the procedurally-generated MiniGrid environments [19], MADE manages to converge... In DeepMind Control Suite [95], we build upon... |
| Dataset Splits | No | The paper mentions training details are in the supplemental material, but the main text does not specify exact train/validation/test dataset splits (e.g., percentages or sample counts) for reproducibility. |
| Hardware Specification | No | In its checklist, the paper answers 'No' to the question 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?', and the main text provides no specific hardware details. |
| Software Dependencies | No | The paper mentions several software components and algorithms used (e.g., IMPALA, RAD, Dreamer, RND, ICM) but does not provide specific version numbers for these, nor for any programming languages or libraries. |
| Experiment Setup | Yes | Details on experiments and hyperparameters are provided in Appendix B. |