Episodic Curiosity through Reachability
Authors: Nikolay Savinov, Anton Raichuk, Damien Vincent, Raphael Marinier, Marc Pollefeys, Timothy Lillicrap, Sylvain Gelly
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our approach in visually rich 3D environments in VizDoom, DMLab and MuJoCo. In navigational tasks from VizDoom and DMLab, our agent outperforms the state-of-the-art curiosity method ICM. In MuJoCo, an ant equipped with our curiosity module learns locomotion out of the first-person-view curiosity only. The code is available at https://github.com/google-research/episodic-curiosity. |
| Researcher Affiliation | Collaboration | Google Brain, DeepMind, ETH Zürich |
| Pseudocode | No | The paper describes algorithmic steps in prose and uses diagrams but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/google-research/episodic-curiosity. |
| Open Datasets | Yes | We test our method in multiple environments from VizDoom (Kempka et al., 2016), DMLab (Beattie et al., 2016) and MuJoCo (Todorov et al., 2012; Schulman et al., 2015). |
| Dataset Splits | Yes | As DMLab environments are procedurally generated, we perform tuning on the validation set, disjoint with the training and test sets. |
| Hardware Specification | No | The paper does not specify the CPU, GPU, or TPU models used to run the experiments. The only related detail it gives is the training algorithm for MuJoCo: 'PPO (same as in the main text of the paper)'. |
| Software Dependencies | No | The paper mentions using the 'PPO algorithm from the open-source implementation2' (footnote 2: 'https://github.com/openai/baselines') and 'gym-mujoco6' (footnote 6: 'https://gym.openai.com/envs/Ant-v2/') but does not provide specific version numbers for software dependencies like Python, TensorFlow/PyTorch, or other libraries. |
| Experiment Setup | Yes | The hyperparameters of the PPO algorithm are given in the supplementary material. We use only two sets of hyperparameters: one for all VizDoom environments and the other one for all DMLab environments. |