Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

End-to-End Training of Deep Visuomotor Policies

Authors: Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel

JMLR 2016

Reproducibility variable, result, and supporting LLM response:

Research Type: Experimental
"We evaluate our method on a range of real-world manipulation tasks that require close coordination between vision and control, such as screwing a cap onto a bottle, and present simulated comparisons to a range of prior policy search methods."

Researcher Affiliation: Academia
"Sergey Levine EMAIL Chelsea Finn EMAIL Trevor Darrell EMAIL Pieter Abbeel EMAIL Division of Computer Science, University of California, Berkeley, CA 94720-1776, USA"

Pseudocode: No
The paper describes the algorithmic steps and equations for guided policy search, BADMM, and trajectory optimization in Sections 3 and 4 and Appendix A. However, there is no clearly labeled block titled 'Pseudocode' or 'Algorithm' with structured steps.

Open Source Code: No
The paper mentions supplementary videos and the use of the Caffe deep learning library, but does not provide specific links to their own implementation code for the described methodology. It states: "All of the robotic experiments discussed in this section may be viewed in the corresponding supplementary video, available online: http://rll.berkeley.edu/icra2015gps. A video illustration of the visuomotor policies, discussed in the following sections, is also available: http://sites.google.com/site/visuomotorpolicy." and "We used the Caffe deep learning library (Jia et al., 2014) for CNN training."

Open Datasets: Yes
"Since the training set is still small (we use 1000 images collected from random arm motions), we initialize the filters in the first layer with weights from the model of Szegedy et al. (2014), which is trained on ImageNet (Deng et al., 2009) classification."

Dataset Splits: No
The paper describes different experimental conditions, such as "training target positions and grasps", "new target positions not seen during training and, for the hammer, new grasps (spatial test)", and "training positions with visual distractors (visual test)". It also mentions "The policies were trained on four different hole positions, and then tested on four new hole positions to evaluate generalization." However, it does not provide specific percentages or sample counts for training, validation, and test splits of a single dataset. The "number of trials per test" in Figure 9 refers to evaluation conditions rather than data partitioning.

Hardware Specification: Yes
"All of the robotic experiments were conducted on a PR2 robot. The robot was controlled at 20 Hz via direct effort control, and camera images were recorded using the RGB camera on a PrimeSense Carmine sensor."

Software Dependencies: No
The paper mentions "We used the Caffe deep learning library (Jia et al., 2014) for CNN training." and "All of the simulated experiments used the MuJoCo simulation package (Todorov et al., 2012)". However, no specific version numbers for Caffe or MuJoCo are provided.

Experiment Setup: Yes
"Our CNNs have 92,000 parameters and 7 layers, including a novel spatial feature point transformation... Our visuomotor policy runs at 20 Hz on the robot... The visual processing layers of the network consist of three convolutional layers... The third convolutional layer contains 32 response maps with resolution 109Ɨ109. These response maps are passed through a spatial softmax function... The spatial feature points (f_cx, f_cy) are concatenated with the robot's configuration and fed into two fully connected layers, each with 40 rectified units, followed by linear connections to the torques... We use a step size of α = 0.1 in all of our experiments... The weights ν_t are initialized to 0.01... The 2D peg insertion task has 6 state dimensions... Trials were 8 seconds in length and simulated at 100 Hz... The cost function is given by ℓ(x_t, u_t) = (1/2) w_u ||u_t||^2 + w_p ℓ_12(p_{x_t} - p)... The weights were set to w_u = 10^-6 and w_p = 1."
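The spatial softmax and feature-point transformation quoted above can be sketched as follows. This is a minimal NumPy illustration, not the authors' Caffe implementation: each response map is normalized into a spatial probability distribution, and the expected (x, y) pixel coordinate under that distribution becomes the feature point. The function name and the [-1, 1] coordinate convention are assumptions for illustration.

```python
import numpy as np

def spatial_softmax_points(response_maps):
    """Compute expected 2D feature points from convolutional response maps.

    response_maps: array of shape (C, H, W), e.g. 32 maps of 109x109.
    Returns an array of shape (C, 2) holding (x, y) expectations,
    with coordinates normalized to [-1, 1] (an illustrative convention).
    """
    C, H, W = response_maps.shape
    # Softmax over each map's spatial locations (shifted for stability).
    flat = response_maps.reshape(C, H * W)
    flat = flat - flat.max(axis=1, keepdims=True)
    probs = np.exp(flat) / np.exp(flat).sum(axis=1, keepdims=True)
    probs = probs.reshape(C, H, W)
    # Normalized coordinate grids for the x (width) and y (height) axes.
    xs = np.linspace(-1.0, 1.0, W)
    ys = np.linspace(-1.0, 1.0, H)
    # Expected position of activation under each map's distribution:
    # marginalize over one axis, then take the mean coordinate on the other.
    fx = (probs.sum(axis=1) * xs).sum(axis=1)  # marginal over H, weighted by x
    fy = (probs.sum(axis=2) * ys).sum(axis=1)  # marginal over W, weighted by y
    return np.stack([fx, fy], axis=1)

# Example with the dimensions quoted in the setup: 32 maps of 109x109.
maps = np.random.randn(32, 109, 109)
points = spatial_softmax_points(maps)  # shape (32, 2)
```

A sharply peaked response map yields a feature point at the peak's location, which is what makes this layer useful as a differentiable "where" representation feeding the fully connected control layers.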