Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
Authors: Yilun Du, Karthik Narasimhan
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform two empirical studies to evaluate our hypothesis. First, we evaluate various frame prediction models, including our proposed Spatial Net, in terms of their capacity to predict future states and model physical interactions (Sections 4.1 and 4.2). Then, we investigate the use of these dynamics predictors for policy learning in different environments (Section 4.3). |
| Researcher Affiliation | Collaboration | 1 Massachusetts Institute of Technology (work partially done at OpenAI), 2 Princeton University. |
| Pseudocode | No | The paper describes the architecture of Spatial Net in Section 3.2 and presents a diagram in Figure 2, but does not provide formal pseudocode or algorithm blocks. |
| Open Source Code | No | No statement about making the source code publicly available or providing a link to a code repository was found. |
| Open Datasets | Yes | Finally, we also evaluate on a stochastic variant of the popular ALE framework consisting of Atari games (Machado et al., 2017a). |
| Dataset Splits | Yes | We generate 5000 different trajectories in total: 4500 for training a dynamics predictor and 500 for testing, with each trajectory having a length of 125 steps. (A sketch of this split follows the table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments were provided in the paper. |
| Software Dependencies | No | The paper mentions 'Pymunk' and 'Bullet' (Coumans, 2010) as tools used, but does not provide specific version numbers for these or any other software dependencies like libraries or frameworks. |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma and Ba, 2015) in our experiments with a learning rate of 10^-4. ... We use the Adam optimizer with learning rate 10^-4 to train model predictions and the same set of hyper-parameters for training all policy agents as those used in (Schulman et al., 2017). (A sketch of this setup follows the table.) |
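
The Dataset Splits row reports 5000 generated trajectories, split 4500/500 between training the dynamics predictor and testing. A minimal sketch of such a split, assuming trajectories are held in an indexable list; the paper's generation code is not released, so the placeholder trajectories and the fixed seed below are assumptions:

```python
import random

# Stand-in for the paper's 5000 simulated trajectories (125 steps each);
# the actual generation pipeline is not released, so these are placeholders.
trajectories = [f"trajectory_{i}" for i in range(5000)]

rng = random.Random(0)  # assumed seed; the paper does not specify one
rng.shuffle(trajectories)

train_set = trajectories[:4500]  # 4500 trajectories for training the dynamics predictor
test_set = trajectories[4500:]   # 500 held-out trajectories for testing
assert len(train_set) == 4500 and len(test_set) == 500
```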
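The Experiment Setup row cites the Adam optimizer with a learning rate of 10^-4 for both dynamics prediction and policy training, with PPO hyper-parameters deferred to Schulman et al. (2017). A minimal PyTorch sketch of that optimizer configuration, using a hypothetical `SpatialNet` module as a stand-in for the paper's architecture (the real model is described in the paper's Section 3.2, and the MSE loss here is an assumption, not a quoted detail):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the paper's Spatial Net frame-prediction model.
class SpatialNet(nn.Module):
    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, channels, kernel_size=3, padding=1),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.net(frames)

model = SpatialNet()
# Adam with learning rate 1e-4, as stated in the paper's experiment setup.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative frame-prediction step on random tensors (shapes are assumptions).
frames, next_frames = torch.rand(8, 3, 84, 84), torch.rand(8, 3, 84, 84)
loss = nn.functional.mse_loss(model(frames), next_frames)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```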