Structured Control Nets for Deep Reinforcement Learning

Authors: Mario Srouji, Jian Zhang, Ruslan Salakhutdinov

ICML 2018

Reproducibility assessment. Each entry below lists the variable, the assessed result, and the LLM's supporting response.

Research Type: Experimental. We validated our hypothesis with competitive results on simulations from OpenAI MuJoCo, Roboschool, Atari, and a custom urban driving environment, with various ablation and generalization tests, trained with multiple black-box and policy gradient training methods.

Researcher Affiliation: Collaboration. (1) Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213; (2) Apple Inc., 1 Infinite Loop, Cupertino, CA 95014.

Pseudocode: No. The paper describes the architecture and experimental procedures in text, but it does not include any clearly labeled 'Pseudocode' or 'Algorithm' block, nor structured, code-like steps.
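Since the paper provides no pseudocode, the following is a minimal sketch of the architecture as the paper's text describes it: the action is the sum of a linear control module and a nonlinear (MLP) control module acting on the same observation. The class name, hidden size, and activation below are illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch of a Structured Control Net (SCN): the policy output is the
# additive combination of a linear control term and a nonlinear MLP term.
# Hidden size and Tanh activation are assumptions for illustration.
import torch
import torch.nn as nn

class StructuredControlNet(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 16):
        super().__init__()
        # Linear control module: u_linear = K @ s + b
        self.linear = nn.Linear(obs_dim, act_dim)
        # Nonlinear control module: a small MLP over the same observation
        self.nonlinear = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # The final action adds the two control terms
        return self.linear(obs) + self.nonlinear(obs)
```

Because only the summed action is exposed to the optimizer, such a network can be trained like any monolithic policy, which is consistent with the paper's use of both black-box (ES) and policy gradient (PPO, ACKTR) training methods.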
Open Source Code: No. The paper states, 'For PPO and ACKTR, we use the same hyper-parameters and algorithm implementation from OpenAI Baselines (Dhariwal et al., 2017).' This refers to a third-party codebase used by the authors, not their own implementation of the Structured Control Net (SCN); the paper provides no link or explicit statement about the availability of the SCN source code.

Open Datasets: Yes. We conduct experiments on several benchmarks, shown in Figure 2, including OpenAI MuJoCo v1 (Todorov et al., 2012), OpenAI Roboschool v1 (OpenAI, 2017), and Atari Games (Bellemare et al., 2013).
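All three benchmark suites are loadable through the OpenAI Gym API of that era; a minimal sketch, assuming v1-era environment IDs and the pre-0.26 reset/step interface (the specific ID below is our assumption, not taken from the paper):

```python
# Hedged sketch: instantiating the cited benchmarks via the v1-era Gym API.
# Requires the old `gym` package plus mujoco-py / roboschool / atari-py.
import gym
# import roboschool  # importing registers the Roboschool* environment IDs

env = gym.make("Hopper-v1")   # e.g., a MuJoCo v1 locomotion task
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())
env.close()
```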
Dataset Splits: No. The paper mentions training networks for '2M timesteps' and 'averaged over 5 training runs with random seeds from 1 to 5', but it does not specify any explicit train/validation/test dataset splits or a cross-validation methodology.

Hardware Specification: Yes. For our ES implementation, we use an efficient shared-memory implementation on a single machine with 48 cores.

Software Dependencies: No. The paper states, 'For PPO and ACKTR, we use the same hyper-parameters and algorithm implementation from OpenAI Baselines (Dhariwal et al., 2017).' While it names a software project, it does not provide specific version numbers for OpenAI Baselines or for any other software components (e.g., Python, TensorFlow, or PyTorch libraries) used in the experiments.

Experiment Setup: Yes. For our ES implementation, we use an efficient shared-memory implementation on a single machine with 48 cores. We set the noise standard deviation and learning rate as 0.1 and 0.01, respectively, and the number of workers to 30. [...] For each experiment, we trained each network for 2M timesteps and averaged over 5 training runs with random seeds from 1 to 5 to obtain each learning curve.
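For context on the quoted setup, here is a minimal sketch of a Salimans-style Evolution Strategies update using the reported hyper-parameters (noise standard deviation 0.1, learning rate 0.01, 30 workers); `evaluate_return` is a hypothetical stand-in for one policy rollout, and refinements such as mirrored sampling or rank normalization are omitted:

```python
# Hedged sketch of an Evolution Strategies update with the paper's reported
# hyper-parameters. evaluate_return() is a hypothetical rollout function.
import numpy as np

SIGMA = 0.1       # noise standard deviation (reported)
ALPHA = 0.01      # learning rate (reported)
N_WORKERS = 30    # number of perturbations per update (reported)

def es_step(theta, evaluate_return, rng):
    # One Gaussian perturbation per worker
    eps = rng.standard_normal((N_WORKERS, theta.size))
    returns = np.array([evaluate_return(theta + SIGMA * e) for e in eps])
    # Normalize returns so the update is invariant to reward scale
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    # Gradient estimate: reward-weighted average of the noise directions
    grad = eps.T @ returns / (N_WORKERS * SIGMA)
    return theta + ALPHA * grad
```

Repeating such a training run for 2M timesteps with seeds 1 through 5 (e.g., `np.random.default_rng(seed)`) and averaging the resulting learning curves would match the evaluation protocol quoted above.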