Exploration via Elliptical Episodic Bonuses
Authors: Mikael Henaff, Roberta Raileanu, Minqi Jiang, Tim Rocktäschel
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 (Experiments): For all our experiments we use the Torchbeast [40] implementation of IMPALA [23] as our base RL algorithm. For certain skill-based tasks, we restricted the action space to the necessary actions for solving the task at hand, since we found that none of the methods were able to make progress with the full action space (see Appendix C.1.4). See Appendix C.1.3 for environment details and C.1 for other experiment details. |
| Researcher Affiliation | Collaboration | Mikael Henaff (Meta AI Research, mikaelhenaff@meta.com); Roberta Raileanu (Meta AI Research, raileanu@meta.com); Minqi Jiang (University College London, Meta AI Research, meta@fb.com); Tim Rocktäschel (University College London, t.rocktaschel@cs.ucl.ac.uk) |
| Pseudocode | Yes | Algorithm 1 Exploration via Episodic Elliptical Bonuses (E3B) |
| Open Source Code | Yes | Our code is available at https://github.com/facebookresearch/e3b. |
| Open Datasets | Yes | We use the HM3D dataset [57], which contains high-quality renditions of 1000 different indoor spaces. |
| Dataset Splits | Yes | The 1000 scenes of the HM3D dataset are split into 800 training, 100 validation, and 100 test scenes based on the splits released with the dataset. |
| Hardware Specification | Yes | All experiments were run on an internal cluster of machines equipped with 8 V100 GPUs and 192GB RAM. |
| Software Dependencies | No | The paper names software such as Torchbeast and PyTorch but gives no version numbers for these or any other dependencies, so the software environment cannot be reproduced from the text alone. |
| Experiment Setup | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix C. |
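
The Pseudocode row cites Algorithm 1 (E3B) without reproducing it. As a rough illustration of what an elliptical episodic bonus can look like, the sketch below keeps an inverse covariance matrix of the feature vectors seen so far in the current episode, scores each new feature vector `phi` with the quadratic form `phi^T C^{-1} phi`, and resets the matrix when a new episode starts. It is not taken from the paper's Algorithm 1 or from the linked repository. The class name `EllipticalEpisodicBonus`, the `ridge` parameter and its default value, and the use of plain Python lists as features are all assumptions made for illustration; how the features themselves are learned is outside the scope of this sketch.

```python
class EllipticalEpisodicBonus:
    """Hypothetical helper (not the authors' code): per-episode elliptical bonus."""

    def __init__(self, dim, ridge=0.1):
        self.dim = dim
        self.ridge = ridge  # assumed regulariser; the value is illustrative only
        self.reset()

    def reset(self):
        # Start of an episode: C = ridge * I, hence C^{-1} = I / ridge.
        self.inv_cov = [
            [1.0 / self.ridge if i == j else 0.0 for j in range(self.dim)]
            for i in range(self.dim)
        ]

    def bonus(self, phi):
        # u = C^{-1} phi, and the bonus is the quadratic form phi^T C^{-1} phi.
        u = [sum(row[j] * phi[j] for j in range(self.dim)) for row in self.inv_cov]
        b = sum(phi[i] * u[i] for i in range(self.dim))
        # Rank-one (Sherman-Morrison) update of the inverse after adding phi phi^T:
        # (C + phi phi^T)^{-1} = C^{-1} - (u u^T) / (1 + b), since C^{-1} is symmetric.
        scale = 1.0 / (1.0 + b)
        for i in range(self.dim):
            for j in range(self.dim):
                self.inv_cov[i][j] -= scale * u[i] * u[j]
        return b


tracker = EllipticalEpisodicBonus(dim=2, ridge=1.0)
first = tracker.bonus([1.0, 0.0])     # direction not yet seen in this episode
repeat = tracker.bonus([1.0, 0.0])    # same direction again: smaller bonus
other = tracker.bonus([0.0, 1.0])     # orthogonal direction: full bonus again
tracker.reset()                       # new episode: the statistics start over
after_reset = tracker.bonus([1.0, 0.0])
```

In this toy run a repeated feature direction earns a smaller bonus than an unseen one, and resetting at the episode boundary restores the initial bonus. The rank-one update avoids inverting the matrix from scratch at every step.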