reproducibilityindex.ai

Self-Supervised Exploration via Disagreement

Authors: Deepak Pathak, Dhiraj Gandhi, Abhinav Gupta

ICML 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the efficacy of this formulation across a variety of benchmark environments including stochastic-Atari, Mujoco and Unity. Finally, we implement our differentiable exploration on a real robot which learns to interact with objects completely from scratch.
Researcher Affiliation	Collaboration	Deepak Pathak* (1), Dhiraj Gandhi* (2), Abhinav Gupta (2, 3). (1) UC Berkeley, (2) CMU, (3) Facebook AI Research.
Pseudocode	No	No pseudocode or clearly labeled algorithm blocks were found in the paper.
Open Source Code	Yes	Project videos and code are at https://pathak22.github.io/exploration-by-disagreement/.
Open Datasets	Yes	We demonstrate the efficacy of this formulation across a variety of benchmark environments including stochastic-Atari, Mujoco and Unity. Finally, we implement our differentiable exploration on a real robot which learns to interact with objects completely from scratch. Project videos and code are at https://pathak22.github.io/exploration-by-disagreement/.
Dataset Splits	Yes	Out of a total of 30 objects, we created a set of 20 objects for training and 10 objects for testing.
Hardware Specification	No	The paper does not provide specific details on the computational hardware (e.g., GPU/CPU models, memory) used for running experiments.
Software Dependencies	No	The paper mentions software such as PPO, MuJoCo, and Unity ML-Agents, but does not provide version numbers for these or any other software dependencies.
Experiment Setup	Yes	In particular, we use random feature space in all video games and navigation, classification features in MNIST and ImageNet-pretrained ResNet-18 features in real world robot experiments. We use 5 models in the ensemble.
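
The setup quoted in the last row (an ensemble of 5 models operating on a random feature space) can be illustrated with a minimal sketch of a disagreement-based intrinsic reward, computed here as the variance of the ensemble's predictions. This is not the authors' code: the linear forward models, the dimensions, and the names `disagreement_reward`, `W_feat`, and `models` are assumptions made for illustration only.

```python
import numpy as np


def disagreement_reward(predictions: np.ndarray) -> np.ndarray:
    """Variance across ensemble members, averaged over feature dimensions.

    predictions has shape (n_models, batch, feat_dim); the result has shape (batch,).
    """
    return predictions.var(axis=0).mean(axis=-1)


rng = np.random.default_rng(0)
# 5 models as stated in the table; all other sizes are arbitrary (assumed).
n_models, batch, obs_dim, act_dim, feat_dim = 5, 4, 8, 2, 16

# Fixed random projection standing in for a "random feature space" (assumption).
W_feat = rng.normal(size=(obs_dim, feat_dim))

# Hypothetical linear forward models: (features, action) -> predicted next features.
models = [rng.normal(size=(feat_dim + act_dim, feat_dim)) for _ in range(n_models)]

obs = rng.normal(size=(batch, obs_dim))
act = rng.normal(size=(batch, act_dim))

phi = obs @ W_feat                           # embed observations
inp = np.concatenate([phi, act], axis=1)     # forward-model input
preds = np.stack([inp @ M for M in models])  # shape (n_models, batch, feat_dim)

reward = disagreement_reward(preds)          # one non-negative value per transition
```

When every ensemble member makes the same prediction, the reward is zero, so in this sketch only transitions on which the models disagree are rewarded.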