Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks
Authors: Albert Yu, Ray Mooney
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section heading: "5 EXPERIMENTS" |
| Researcher Affiliation | Academia | Albert Yu (UT Austin, albertyu@utexas.edu); Raymond J. Mooney (UT Austin, mooney@utexas.edu) |
| Pseudocode | Yes | Algorithm 1 DeL-TaCo: Training |
| Open Source Code | Yes | We link to our open-sourced codebase on our project website, https://deltaco-robot.github.io. |
| Open Datasets | Yes | We develop a PyBullet (Coumans & Bai, 2007-2022) simulation environment with a WidowX 250 robot arm, 32 possible objects of diverse colors and shapes for manipulation, and 2 different containers. Using a scripted policy (details in Appendix C), we collect roughly 130 successful demonstrations for each training task, and a single successful demonstration for each test task. All demonstrations are 30 timesteps long. Depending on our experimental scenario (see Section 5.2), we train on 65% to 80% of the 300 tasks, so our training buffer contains roughly 26,000-31,000 trajectories. Appendix A provides the full list of our 300 tasks, instructions, and objects, as well as train and test task splits. |
| Dataset Splits | Yes | We define a set of n tasks {T_i}_{i=1}^n and split them into training tasks U and test tasks V, where (U, V) is a bipartition of {T_i}_{i=1}^n. During evaluation, we assume access to a buffer D_val of trajectories for only the tasks in V and their associated natural language descriptions. Appendix A provides the full list of our 300 tasks, instructions, and objects, as well as train and test task splits. Scenario A (novel objects, colors, and shapes) trains on all gray tasks and tests on yellow, blue, and green tasks. Scenario B (novel colors and shapes) trains on all gray and yellow tasks and tests on blue and green tasks. (A back-of-envelope check of the resulting buffer sizes appears below the table.) |
| Hardware Specification | No | The paper mentions a 'PyBullet simulation environment' and the simulated 'WidowX 250 robot arm', but does not specify the compute hardware (e.g., GPUs, CPUs, or memory) used to run the experiments or train the models. |
| Software Dependencies | No | The paper mentions software such as DistilBERT, PyBullet (Coumans & Bai, 2007-2022), and ResNet-18, but does not provide version numbers for these or for other software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow). (A minimal instruction-encoding sketch appears below the table.) |
| Experiment Setup | Yes | Table 5: policy π hyperparameters; Table 6: f_demo CNN hyperparameters; Table 7: imitation learning hyperparameters. These tables give specific values for learning rate, batch size, number of tasks per batch, task encoder weight, contrastive learning temperature, input/output sizes, kernel sizes, strides, activation functions, and image augmentation. (A hedged training-step sketch using the contrastive temperature and task-encoder weight appears below the table.) |
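
The buffer sizes quoted in the Open Datasets and Dataset Splits rows can be cross-checked with simple arithmetic. Below is a back-of-envelope sketch in Python using only numbers from the quoted text; the mapping of the 65% and 80% fractions onto Scenarios A and B is an assumption, not something the quotes state.

```python
# Back-of-envelope check of the training-buffer sizes quoted above.
# Constants come from the quoted text; the per-scenario split fractions
# (A -> 65%, B -> 80%) are assumptions.
NUM_TASKS = 300
DEMOS_PER_TRAIN_TASK = 130   # "roughly 130 successful demonstrations"
TIMESTEPS_PER_DEMO = 30      # "All demonstrations are 30 timesteps long."

for scenario, train_frac in [("A", 0.65), ("B", 0.80)]:
    n_train_tasks = round(NUM_TASKS * train_frac)   # |U|, the training tasks
    n_test_tasks = NUM_TASKS - n_train_tasks        # |V|, the held-out tasks
    n_trajectories = n_train_tasks * DEMOS_PER_TRAIN_TASK
    print(f"Scenario {scenario}: {n_train_tasks} train / {n_test_tasks} test "
          f"tasks, ~{n_trajectories:,} trajectories "
          f"({n_trajectories * TIMESTEPS_PER_DEMO:,} environment steps)")
```

This yields roughly 25,350 and 31,200 trajectories, consistent with the quoted "roughly 26,000-31,000" range.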
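
The Software Dependencies row notes that DistilBERT is used without a pinned version. Below is a minimal sketch of one common way to turn an instruction into a fixed-size embedding with Hugging Face Transformers; the checkpoint name, the mean-pooling step, and the example instruction are assumptions, not the paper's confirmed pipeline.

```python
# Minimal sketch: encode a task instruction with DistilBERT.
# Checkpoint choice and mean-pooling are assumptions; the paper does not
# document its exact encoding pipeline or library versions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")

def embed_instruction(instruction: str) -> torch.Tensor:
    """Return a (768,) vector by mean-pooling the final hidden states."""
    inputs = tokenizer(instruction, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)

# Hypothetical instruction in the style of the paper's pick-and-place tasks.
emb = embed_instruction("put the green bowl in the container")
print(emb.shape)  # torch.Size([768])
```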
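
The Experiment Setup row lists a task encoder weight and a contrastive learning temperature among the hyperparameters, and the Pseudocode row points to Algorithm 1 (DeL-TaCo: Training). The sketch below shows one plausible training step consistent with those knobs: a behavior-cloning loss plus a temperature-scaled, symmetric InfoNCE term aligning demonstration and language task embeddings. The function names, tensor shapes, and loss composition are assumptions, not a transcription of Algorithm 1.

```python
# Hedged sketch of a training step: behavior cloning plus a weighted,
# temperature-scaled contrastive term over task embeddings. Shapes and
# the exact loss composition are assumptions, not the paper's Algorithm 1.
import torch
import torch.nn.functional as F

def contrastive_task_loss(demo_emb, lang_emb, temperature=0.1):
    """Symmetric InfoNCE over a batch: row i of each tensor is task i."""
    demo_emb = F.normalize(demo_emb, dim=-1)
    lang_emb = F.normalize(lang_emb, dim=-1)
    logits = demo_emb @ lang_emb.T / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))         # positives on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))

def training_step(policy, f_demo, f_lang, batch, task_weight=1.0):
    """Behavior-cloning loss plus a weighted contrastive alignment term."""
    demo_emb = f_demo(batch["demo_frames"])        # demonstration task encoder
    lang_emb = f_lang(batch["instructions"])       # language task encoder
    pred = policy(batch["observations"], demo_emb, lang_emb)
    bc_loss = F.mse_loss(pred, batch["actions"])   # continuous-action BC
    return bc_loss + task_weight * contrastive_task_loss(demo_emb, lang_emb)

# The contrastive term runs standalone, e.g. on a random 8-task batch:
print(contrastive_task_loss(torch.randn(8, 128), torch.randn(8, 128)))
```

The batching here assumes one demonstration-language pair per task per batch, matching the "number of tasks per batch" hyperparameter named in the quoted tables.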