Curiosity-driven Exploration by Self-supervised Prediction
Authors: Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, Trevor Darrell
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed approach is evaluated in two environments: VizDoom and Super Mario Bros. Three broad settings are investigated: 1) sparse extrinsic reward, where curiosity allows the agent to reach the goal with far fewer interactions with the environment; 2) exploration with no extrinsic reward, where curiosity pushes the agent to explore more efficiently; and 3) generalization to unseen scenarios (e.g., new levels of the same game), where knowledge gained from earlier experience helps the agent explore new places much faster than starting from scratch. |
| Researcher Affiliation | Academia | Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, Trevor Darrell (University of California, Berkeley). Correspondence to: Deepak Pathak <pathak@berkeley.edu>. |
| Pseudocode | No | The paper describes the proposed method using textual descriptions and a system diagram (Figure 2) but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | At the end of the paper, it states: 'To further aid in this effort, we will make the code for our algorithm, as well as testing and environment setups freely available online.' This indicates a future release, not concrete access at the time of publication. |
| Open Datasets | Yes | Our first environment is the VizDoom (Kempka et al., 2016) game... Our testing setup in all the experiments is the DoomMyWayHome-v0 environment which is available as part of OpenAI Gym (Brockman et al., 2016). Our second environment is the classic Nintendo game Super Mario Bros with a reparameterized 14-dimensional action space following (Paquette, 2016). (A hedged environment-loading sketch follows the table.) |
| Dataset Splits | No | The paper describes varying reward sparsity and spawning locations for experimental scenarios but does not provide specific percentages or counts for training, validation, and test dataset splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments; it only mentions general environments such as VizDoom and the use of A3C. |
| Software Dependencies | No | The paper mentions software components and environments such as 'OpenAI Gym', 'VizDoom', and 'A3C' along with citations, but it does not specify explicit version numbers for these software dependencies (e.g., 'OpenAI Gym vX.Y.Z'). |
| Experiment Setup | No | The paper defines the scaling factor (η) and weights (β, λ) used in the loss function, and mentions the maximum number of time steps per episode, but it does not provide concrete hyperparameter values such as learning rates, batch sizes, or specific optimizer settings for the A3C agent. (A hedged sketch of the η/β/λ-weighted ICM loss follows the table.) |
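
As a companion to the Open Datasets row, here is a minimal sketch of loading the sparse-reward Doom environment quoted above through OpenAI Gym. Only the environment ID comes from the paper's text; the Gym version behavior noted in the comments (bundled Doom environments vs. the ppaquette gym-doom plugin) and the random-policy rollout are assumptions for illustration, not setup details confirmed by the paper.

```python
import gym

# Environment ID as quoted in the paper ("DoomMyWayHome-v0", exposed through
# OpenAI Gym). Later Gym releases dropped the bundled Doom environments; with
# the ppaquette gym-doom plugin the ID is assumed to be
# 'ppaquette/DoomMyWayHome-v0' instead.
env = gym.make('DoomMyWayHome-v0')

obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()          # random policy, illustration only
    obs, reward, done, info = env.step(action)  # classic 4-tuple Gym API (2016-era)
    total_reward += reward
env.close()
print('episode return:', total_reward)
```

As a companion to the Experiment Setup row, here is a minimal sketch of how the intrinsic reward and the weighted ICM loss described in the paper fit together: the intrinsic reward is the η-scaled forward-model prediction error, and the inverse- and forward-model losses are combined with weights (1 − β) and β, while the −λ·E[Σ r_t] policy term is handled by the A3C agent. The PyTorch framing, tensor names, and default numeric values are illustrative assumptions, not settings reported in the paper.

```python
import torch
import torch.nn.functional as F

def icm_losses(phi_next_pred, phi_next, action_logits, actions,
               eta=0.01, beta=0.2):
    # Intrinsic reward: scaled forward-model prediction error,
    # r^i_t = (eta / 2) * ||phi_hat(s_{t+1}) - phi(s_{t+1})||^2_2
    forward_err = 0.5 * (phi_next_pred - phi_next).pow(2).sum(dim=-1)
    intrinsic_reward = eta * forward_err.detach()

    # Forward-model loss L_F and inverse-model loss L_I (cross-entropy over
    # the discrete action predicted from the features of s_t and s_{t+1}).
    loss_forward = forward_err.mean()
    loss_inverse = F.cross_entropy(action_logits, actions)

    # Combined ICM loss, (1 - beta) * L_I + beta * L_F; the policy-gradient
    # term -lambda * E[sum_t r_t] is optimized separately by the A3C agent.
    icm_loss = (1.0 - beta) * loss_inverse + beta * loss_forward
    return intrinsic_reward, icm_loss
```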
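In a typical training loop, `intrinsic_reward` would be added to (or, in the no-extrinsic-reward setting, replace) the environment reward before computing the A3C advantage, and `icm_loss` would be backpropagated through the feature encoder, inverse model, and forward model; this wiring is an assumption based on the paper's description rather than released code.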