Learning to Manipulate Unknown Objects in Clutter by Reinforcement
Authors: Abdeslam Boularias, J. Andrew Bagnell, Anthony Stentz
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The performance of our system is assessed on a robot with real-world objects. ... We performed extensive tests of the presented system using a WAM robotic arm equipped with a Barrett hand and a time-of-flight camera (Kinect). |
| Researcher Affiliation | Academia | Abdeslam Boularias and J. Andrew Bagnell and Anthony Stentz The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213 USA {abdeslam, dbagnell, tony}@andrew.cmu.edu |
| Pseudocode | No | The paper includes a 'System Overview' diagram (Figure 1) which illustrates the work-flow, but it does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states: 'For transparency, unedited videos of all the experiments have been uploaded to http://goo.gl/ze1Sqq'. This link is for videos, not source code. No explicit statement about code availability is made. |
| Open Datasets | No | The paper describes experiments with 'real-world objects' and collected data ('Given data set $D_t = \{(s_i, a_i, r_i, s_{i+1}) \mid i \in [0, t[\}$ of observed states, executed actions, and received rewards up to current time $t$'), but it does not mention using any publicly available dataset, nor does it provide access information for the collected data. |
| Dataset Splits | Yes | Finally, the kernel bandwidths are tuned in a leave-one-sequence-out cross-validation (Section 7). |
| Hardware Specification | No | The paper mentions the robotic setup: 'We performed extensive tests of the presented system using a WAM robotic arm equipped with a Barrett hand and a time-of-flight camera (Kinect).' This describes the robotic components but does not specify the computing hardware (CPU, GPU models, memory, etc.) used to run the experiments and computations. |
| Software Dependencies | No | The paper mentions specific algorithms/libraries like 'The CHOMP algorithm (Ratliff et al. 2009)' and 'a library of compliant hand motions with force-feedback (Kazemi et al. 2012)', as well as 'k-means', 'Mean-Shift', and 'spectral clustering'. However, no specific version numbers for any of these software components or libraries are provided. |
| Experiment Setup | Yes | $\gamma$ is set to 0.5 in our experiments. ... $\alpha$ is a constant, set to 0.1 in all our experiments. ... $\epsilon_{\text{grasp}}$ and $\epsilon_{\text{push}}$ are hyper-parameters that cannot be manually tuned... We set $\epsilon_{\text{push}}$ to 0 and search for $\epsilon^{*}_{\text{grasp}} \in \{\xi_{\text{grasp}}/2^{n}\}$, $n = 0, \dots, 10$, that has the lowest average Bellman error. The best threshold is further tuned by performing a grid-search in the interval $[\epsilon^{*}_{\text{grasp}}, 2\epsilon^{*}_{\text{grasp}}]$. $\epsilon^{*}_{\text{push}}$ is obtained using a similar approach, with $\epsilon_{\text{grasp}}$ set to $\epsilon^{*}_{\text{grasp}}$. |
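
The threshold search quoted in the Experiment Setup row can be read as a coarse-to-fine procedure: a halving search over ξ/2^n for n = 0, ..., 10, followed by a grid search between the best coarse value and twice that value. The sketch below is a minimal illustration of that reading, not the authors' code. The names `coarse_to_fine`, `tune_thresholds`, `avg_bellman_error`, and `xi_push`, as well as the grid resolution `n_grid`, are assumptions introduced here; the paper excerpt does not give a grid resolution or name an upper bound for the push threshold.

```python
from typing import Callable, Tuple


def coarse_to_fine(xi: float,
                   error: Callable[[float], float],
                   n_halvings: int = 10,
                   n_grid: int = 11) -> float:
    """Halving search over xi / 2**n, then a grid search in [best, 2 * best]."""
    # Coarse stage: candidates xi / 2**n for n = 0, ..., n_halvings.
    coarse = [xi / 2 ** n for n in range(n_halvings + 1)]
    best = min(coarse, key=error)
    # Fine stage: n_grid evenly spaced points covering [best, 2 * best].
    # n_grid is an assumed resolution, not a value from the paper.
    fine = [best + best * k / (n_grid - 1) for k in range(n_grid)]
    return min(fine, key=error)


def tune_thresholds(xi_grasp: float,
                    xi_push: float,
                    avg_bellman_error: Callable[[float, float], float]
                    ) -> Tuple[float, float]:
    """Tune the grasp threshold with push fixed at 0, then tune push."""
    # Step 1: push threshold held at 0 while searching the grasp threshold.
    eps_grasp = coarse_to_fine(xi_grasp, lambda e: avg_bellman_error(e, 0.0))
    # Step 2: grasp threshold held at its tuned value while searching push.
    # xi_push is a hypothetical upper bound mirroring xi_grasp.
    eps_push = coarse_to_fine(xi_push, lambda e: avg_bellman_error(eps_grasp, e))
    return eps_grasp, eps_push


# Toy stand-in for the cross-validated average Bellman error (hypothetical).
def toy_error(eps_grasp: float, eps_push: float) -> float:
    return (eps_grasp - 0.3) ** 2 + (eps_push - 0.6) ** 2


eps_grasp, eps_push = tune_thresholds(1.0, 1.0, toy_error)
```

In practice `avg_bellman_error` would be whatever cross-validated error estimate the system computes; the toy quadratic is only there so the sketch runs on its own.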