Episodic Curiosity through Reachability

Authors: Nikolay Savinov, Anton Raichuk, Damien Vincent, Raphaël Marinier, Marc Pollefeys, Timothy Lillicrap, Sylvain Gelly

ICLR 2019

Reproducibility Assessment

Each variable below lists the assessed result followed by the supporting LLM response.
Research Type: Experimental
LLM Response: We test our approach in visually rich 3D environments in VizDoom, DMLab and MuJoCo. In navigational tasks from VizDoom and DMLab, our agent outperforms the state-of-the-art curiosity method ICM. In MuJoCo, an ant equipped with our curiosity module learns locomotion out of the first-person-view curiosity only. The code is available at https://github.com/google-research/episodic-curiosity.

Researcher Affiliation: Collaboration
LLM Response: Google Brain, DeepMind, ETH Zürich

Pseudocode: No
LLM Response: The paper describes algorithmic steps in prose and uses diagrams but does not include structured pseudocode or algorithm blocks.
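
Since the paper gives the procedure only in prose, a minimal sketch of the reachability-based bonus may help. This is our reading of the method; the function names, the percentile, and all constants are assumptions to be checked against the released code:

```python
import numpy as np

# Placeholder constants; the paper's actual alpha, beta, and novelty
# threshold are given in its text and supplementary material.
ALPHA = 1.0   # bonus scale ("alpha" in the paper)
BETA = 0.5    # bonus shift ("beta" in the paper)
NOVELTY_THRESHOLD = 0.0

def episodic_curiosity_bonus(memory, embedding, comparator):
    """Reachability-based curiosity bonus, following the paper's prose.

    memory:     embeddings of observations stored during this episode
    embedding:  embedding of the current observation
    comparator: trained network mapping two embeddings to a score in
                [0, 1], the probability that one observation is reachable
                from the other within k environment steps
    """
    if not memory:
        memory.append(embedding)
        return 0.0
    # Compare the current observation against every memory entry.
    scores = np.array([comparator(m, embedding) for m in memory])
    # Aggregate into one similarity score with a robust statistic
    # (the paper uses a high percentile rather than the maximum).
    similarity = np.percentile(scores, 90)
    bonus = ALPHA * (BETA - similarity)
    # Only observations that look novel, i.e. hard to reach from
    # anything already in memory, are added to the episodic memory.
    if bonus > NOVELTY_THRESHOLD:
        memory.append(embedding)
    return bonus
```

The paper also bounds the episodic memory's size and evicts entries when it is full; that bookkeeping is omitted here for brevity.
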
Open Source Code: Yes
LLM Response: The code is available at https://github.com/google-research/episodic-curiosity.

Open Datasets: Yes
LLM Response: We test our method in multiple environments from VizDoom (Kempka et al., 2016), DMLab (Beattie et al., 2016) and MuJoCo (Todorov et al., 2012; Schulman et al., 2015).

Dataset Splits: Yes
LLM Response: As DMLab environments are procedurally generated, we perform tuning on the validation set, disjoint with the training and test sets.
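
For procedurally generated environments, disjoint splits are typically enforced at the level of generation seeds. The sketch below illustrates the idea only; the seed ranges are hypothetical, not the authors' protocol:

```python
# Hypothetical seed ranges for procedurally generated levels; the paper
# does not state the actual split sizes.
TRAIN_SEEDS = set(range(0, 1000))
VALID_SEEDS = set(range(1000, 1200))  # used only for hyperparameter tuning
TEST_SEEDS = set(range(1200, 1400))

# Disjointness check: no seed may appear in more than one split.
assert not TRAIN_SEEDS & VALID_SEEDS
assert not TRAIN_SEEDS & TEST_SEEDS
assert not VALID_SEEDS & TEST_SEEDS
```
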
Hardware Specification: No
LLM Response: The paper does not specify any particular CPU, GPU, or TPU models used for running the experiments. It only describes the training setup in general terms, e.g. 'PPO (same as in the main text of the paper)' for MuJoCo.

Software Dependencies: No
LLM Response: The paper mentions using the 'PPO algorithm from the open-source implementation' (footnote 2: https://github.com/openai/baselines) and 'gym-mujoco' (footnote 6: https://gym.openai.com/envs/Ant-v2/) but does not provide specific version numbers for software dependencies like Python, TensorFlow/PyTorch, or other libraries.
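
A reproduction therefore has to pin versions itself. A minimal, assumed setup for the MuJoCo experiment is sketched below; the install commands and version pins are our guesses, not the paper's:

```python
# Assumed installation (illustrative pins, since the paper gives none):
#   pip install gym==0.10.5 mujoco-py==1.50.1.68
#   pip install git+https://github.com/openai/baselines
import gym

# "Ant-v2" is the gym-mujoco task cited for the locomotion experiment.
env = gym.make("Ant-v2")
observation = env.reset()
```
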
Experiment Setup: Yes
LLM Response: The hyperparameters of the PPO algorithm are given in the supplementary material. We use only two sets of hyperparameters: one for all VizDoom environments and the other one for all DMLab environments.
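
One way to mirror that two-set structure in a reproduction is to key the PPO configuration on the environment family. The values below are generic placeholders, not the paper's settings, which must be taken from its supplementary material:

```python
# Placeholder values only; substitute the paper's supplementary settings.
PPO_HPARAMS = {
    "vizdoom": {"learning_rate": 2.5e-4, "entropy_coef": 0.01, "gamma": 0.99},
    "dmlab": {"learning_rate": 2.5e-4, "entropy_coef": 0.0033, "gamma": 0.99},
}

def hparams_for(env_name: str) -> dict:
    """Return the single hyperparameter set for an environment's family."""
    family = "vizdoom" if "vizdoom" in env_name.lower() else "dmlab"
    return PPO_HPARAMS[family]
```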