Learning to Manipulate Unknown Objects in Clutter by Reinforcement

Authors: Abdeslam Boularias, James Bagnell, Anthony Stentz

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The performance of our system is assessed on a robot with real-world objects. ... We performed extensive tests of the presented system using a WAM robotic arm equipped with a Barrett hand and a time-of-flight camera (Kinect).
Researcher Affiliation | Academia | Abdeslam Boularias and J. Andrew Bagnell and Anthony Stentz, The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213 USA, {abdeslam, dbagnell, tony}@andrew.cmu.edu
Pseudocode | No | The paper includes a 'System Overview' diagram (Figure 1) which illustrates the workflow, but it does not present any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states: 'For transparency, unedited videos of all the experiments have been uploaded to http://goo.gl/ze1Sqq'. This link is for videos, not source code. No explicit statement about code availability is made.
Open Datasets | No | The paper describes experiments with 'real-world objects' and collected data ('Given data set $D_t = \{(s_i, a_i, r_i, s_{i+1}) \mid i \in [0, t[\}$ of observed states, executed actions, and received rewards up to current time t'), but it does not mention using any publicly available dataset, nor does it provide access information for the collected data.
Dataset Splits | Yes | Finally, the kernel bandwidths are tuned in a leave-one-sequence-out cross-validation (Section 7).
Hardware Specification | No | The paper mentions the robotic setup: 'We performed extensive tests of the presented system using a WAM robotic arm equipped with a Barrett hand and a time-of-flight camera (Kinect).' This describes the robotic components but does not specify the computing hardware (CPU, GPU models, memory, etc.) used to run the experiments and computations.
Software Dependencies | No | The paper mentions specific algorithms/libraries like 'The CHOMP algorithm (Ratliff et al. 2009)' and 'a library of compliant hand motions with force-feedback (Kazemi et al. 2012)', as well as 'k-means', 'Mean-Shift', and 'spectral clustering'. However, no specific version numbers for any of these software components or libraries are provided.
Experiment Setup | Yes | γ is set to 0.5 in our experiments. ... α is a constant, set to 0.1 in all our experiments. ... $\epsilon_{\text{grasp}}$ and $\epsilon_{\text{push}}$ are hyper-parameters that cannot be manually tuned... We set $\epsilon_{\text{push}}$ to 0 and search for $\epsilon^*_{\text{grasp}} \in \{\xi_{\text{grasp}}/2^n\}$, $n = 0, \dots, 10$, that has the lowest average Bellman error. The best threshold is further tuned by performing a grid-search in the interval $[\epsilon^*_{\text{grasp}}, 2\epsilon^*_{\text{grasp}}]$. $\epsilon^*_{\text{push}}$ is obtained using a similar approach, with $\epsilon_{\text{grasp}}$ set to $\epsilon^*_{\text{grasp}}$.
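
The threshold search quoted in the Experiment Setup row can be read as a coarse-to-fine procedure: try the candidates obtained by repeatedly halving an upper value, keep the one with the lowest error, then refine with a grid over the interval from that value to twice that value. The Python sketch below illustrates this reading only and is not the authors' code. The function name `coarse_to_fine_threshold`, the callable `error_fn` (standing in for the average Bellman error), the grid resolution `n_grid`, the upper value used for the push threshold, and the toy error function are all assumptions introduced here.

```python
from typing import Callable, Tuple


def coarse_to_fine_threshold(
    error_fn: Callable[[float], float],
    xi: float,
    n_halvings: int = 10,
    n_grid: int = 11,
) -> Tuple[float, float]:
    """Return (best_threshold, best_error) from a two-stage search.

    error_fn is a hypothetical stand-in for the average Bellman error
    as a function of a single threshold.
    """
    # Coarse stage: candidates xi / 2**n for n = 0, ..., n_halvings.
    coarse = [xi / 2 ** n for n in range(n_halvings + 1)]
    eps_star = min(coarse, key=error_fn)

    # Fine stage: evenly spaced grid over [eps_star, 2 * eps_star].
    # The number of grid points is an assumption; the source does not give it.
    step = eps_star / (n_grid - 1)
    fine = [eps_star + i * step for i in range(n_grid)]
    best = min(fine, key=error_fn)
    return best, error_fn(best)


def toy_error(eps_grasp: float, eps_push: float) -> float:
    """Toy objective (assumption) with its minimum at (0.3, 0.6)."""
    return (eps_grasp - 0.3) ** 2 + (eps_push - 0.6) ** 2


# Tune the grasp threshold with the push threshold fixed at 0,
# then tune the push threshold with the grasp threshold fixed at its best value.
eps_grasp, _ = coarse_to_fine_threshold(lambda e: toy_error(e, 0.0), xi=1.0)
eps_push, _ = coarse_to_fine_threshold(lambda e: toy_error(eps_grasp, e), xi=1.0)
```

Tuning one threshold at a time keeps each search one-dimensional, at the cost of possibly missing a joint optimum that a full two-dimensional grid would find.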
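
The Dataset Splits row mentions leave-one-sequence-out cross-validation for the kernel bandwidths. As a rough illustration of that splitting scheme, the Python sketch below holds out all samples of one sequence per fold and picks the bandwidth with the lowest mean held-out score. What counts as a sequence (presumably one recorded run of actions), the scoring function `score_fn`, and the names `leave_one_sequence_out` and `select_bandwidth` are assumptions made here and are not taken from the paper.

```python
from typing import Callable, Hashable, Iterator, List, Sequence, Tuple


def leave_one_sequence_out(
    sequence_ids: Sequence[Hashable],
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices), holding out one whole sequence per fold."""
    # dict.fromkeys keeps the first-seen order of the distinct sequence ids.
    for held_out in dict.fromkeys(sequence_ids):
        test = [i for i, s in enumerate(sequence_ids) if s == held_out]
        train = [i for i, s in enumerate(sequence_ids) if s != held_out]
        yield train, test


def select_bandwidth(
    candidates: Sequence[float],
    sequence_ids: Sequence[Hashable],
    score_fn: Callable[[float, List[int], List[int]], float],
) -> float:
    """Return the candidate with the lowest mean held-out score.

    score_fn(bandwidth, train_indices, test_indices) is hypothetical: it
    should fit on the training samples and return an error on the held-out ones.
    """
    folds = list(leave_one_sequence_out(sequence_ids))

    def mean_score(h: float) -> float:
        return sum(score_fn(h, tr, te) for tr, te in folds) / len(folds)

    return min(candidates, key=mean_score)


# Six samples drawn from three sequences.
ids = [0, 0, 1, 1, 1, 2]
splits = list(leave_one_sequence_out(ids))

# Toy score (assumption) that is smallest at bandwidth 0.5.
chosen = select_bandwidth([0.1, 0.5, 1.0], ids, lambda h, tr, te: (h - 0.5) ** 2)
```

Holding out whole sequences, instead of individual samples, avoids evaluating on transitions that are strongly correlated with those used for fitting. scikit-learn's `LeaveOneGroupOut` provides the same kind of split if that library is available.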