Third-Person Visual Imitation Learning via Decoupled Hierarchical Controller
Authors: Pratyusha Sharma, Deepak Pathak, Abhinav Gupta
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show results on a real robotic platform using Baxter for the manipulation tasks of pouring and placing objects in a box. We first systematically evaluate the quality of both the high-level and low-level modules individually given perfect information on held-out test examples of human video and robot trajectories. We then ablate the generalization properties of these modules across the same task with different scenarios and different tasks with different scenarios. Finally, we deploy the complete system on the Baxter robot for performing tasks with novel objects and demonstrations. |
| Researcher Affiliation | Collaboration | Pratyusha Sharma (MIT), Deepak Pathak (Facebook AI Research), Abhinav Gupta (CMU) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper links to a project video but does not state that source code for the methodology is available, nor does it provide a repository link. |
| Open Datasets | Yes | We use the MIME dataset [25] of human demonstrations to train our decoupled hierarchical controllers. The dataset is collected using a Baxter robot and contains pairs of 8260 human-kinesthetic robot demonstrations spanned across 20 tasks. |
| Dataset Splits | Yes | For the pouring task, we train on 230 demonstrations, validate on 29, and test on 30 demonstrations. For the models trained on multiple tasks, 6632 demonstrations were used for training, 829 for validation, and 829 for test. |
| Hardware Specification | Yes | We show results on a real robotic platform using Baxter for the manipulation tasks of pouring and placing objects in a box. |
| Software Dependencies | No | The paper mentions using a U-Net based architecture and ResNet-18 model, but does not provide specific software versions for libraries, frameworks, or programming languages. |
| Experiment Setup | Yes | We sub-sampled the trajectories (both images and joint angle states) to a fixed length of 200 time steps for training our models. For training the low-level inverse model, we perform regression over the action space of the robot a_t, which is a fourteen-dimensional joint angle state [θ1, θ2, θ3, ..., θ14]. All the training and implementation details related to our hierarchical controllers are provided in the supplementary. (Hedged sketches of this pre-processing and regression target appear below the table.) |
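
The fixed-length sub-sampling described in the Experiment Setup row is straightforward to illustrate. Below is a minimal sketch that assumes uniform sub-sampling over time (the paper does not specify the sampling scheme); the array shapes and the `subsample_trajectory` helper are illustrative, not from the paper.

```python
import numpy as np

def subsample_trajectory(frames, joint_states, target_len=200):
    """Uniformly sub-sample a paired (image, joint-angle) trajectory
    to a fixed number of time steps (200 in the paper)."""
    # Evenly spaced indices over the original trajectory length.
    idx = np.linspace(0, len(frames) - 1, num=target_len).astype(int)
    return frames[idx], joint_states[idx]

# Example: a hypothetical demonstration with 517 raw time steps
# and the paper's 14-dimensional joint-angle state.
frames = np.zeros((517, 224, 224, 3), dtype=np.uint8)  # placeholder images
joints = np.zeros((517, 14), dtype=np.float32)          # 14-D joint states
frames_200, joints_200 = subsample_trajectory(frames, joints)
assert frames_200.shape[0] == joints_200.shape[0] == 200
```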
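The low-level inverse model's regression target can likewise be sketched. The PyTorch code below is a hypothetical illustration, assuming a ResNet-18 image encoder (the paper mentions ResNet-18, but the exact architecture is deferred to the supplementary), an input pairing of current and goal frames, and an MSE loss over the 14-dimensional joint-angle action; the `InverseModel` class and its head dimensions are assumptions, not the authors' model.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class InverseModel(nn.Module):
    """Hypothetical inverse model: maps (current frame, goal frame)
    to a 14-D joint-angle action [theta_1, ..., theta_14]."""
    def __init__(self, action_dim=14):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()          # keep the 512-D visual feature
        self.encoder = backbone
        self.head = nn.Sequential(           # assumed head dimensions
            nn.Linear(512 * 2, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, obs_t, obs_goal):
        z = torch.cat([self.encoder(obs_t), self.encoder(obs_goal)], dim=1)
        return self.head(z)                  # predicted joint-angle action

model = InverseModel()
obs_t = torch.randn(2, 3, 224, 224)          # current-frame batch
obs_goal = torch.randn(2, 3, 224, 224)       # goal-frame batch
target = torch.zeros(2, 14)                  # placeholder ground-truth angles
loss = nn.functional.mse_loss(model(obs_t, obs_goal), target)
```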