Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks

Authors: Albert Yu, Ray Mooney

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "5 EXPERIMENTS"
Researcher Affiliation | Academia | "Albert Yu, UT Austin, albertyu@utexas.edu; Raymond J. Mooney, UT Austin, mooney@utexas.edu"
Pseudocode | Yes | "Algorithm 1 DeL-TaCo: Training"
Open Source Code | Yes | "We link to our open-sourced codebase on our project website, https://deltaco-robot.github.io."
Open Datasets | Yes | "We develop a Pybullet (Coumans & Bai, 2007-2022) simulation environment with a WidowX 250 robot arm, 32 possible objects of diverse colors and shapes for manipulation, and 2 different containers. Using a scripted policy (details in Appendix C), we collect roughly 130 successful demonstrations for each training task, and a single successful demonstration for each test task. All demonstrations are 30 timesteps long. Depending on our experimental scenario (see Section 5.2), we train on 65% to 80% of the 300 tasks, so our training buffer contains roughly 26,000-31,000 trajectories. Appendix A provides the full list of our 300 tasks, instructions, and objects, as well as train and test task splits." (A hedged data-collection sketch follows the table.)
Dataset Splits | Yes | "We define a set of n tasks {T_i}_{i=1}^n and split them into training tasks U and test tasks V, where (U, V) is a bipartition of {T_i}_{i=1}^n. During evaluation, we assume access to a buffer D_val of trajectories for only the tasks in V and their associated natural language descriptions. Appendix A provides the full list of our 300 tasks, instructions, and objects, as well as train and test task splits. Scenario A (novel objects, colors, and shapes) trains on all gray tasks and tests on yellow, blue, and green tasks. Scenario B (novel colors and shapes) trains on all gray and yellow tasks and tests on blue and green tasks." (A hedged split sketch follows the table.)
Hardware Specification | No | The paper mentions the PyBullet simulation environment and the simulated WidowX 250 robot arm, but does not specify the compute hardware (e.g., GPUs, CPUs, or memory) used to run the experiments or train the models.
Software Dependencies | No | The paper mentions software such as DistilBERT, PyBullet (Coumans & Bai, 2007-2022), and ResNet-18, but does not provide version numbers for these or for other dependencies such as the deep learning framework (e.g., PyTorch or TensorFlow). (A hedged DistilBERT encoding sketch follows the table.)
Experiment Setup | Yes | Table 5: Policy π hyperparameters; Table 6: f_demo CNN hyperparameters; Table 7: Imitation learning hyperparameters. These tables provide specific values for the learning rate, batch size, number of tasks per batch, task encoder weight, contrastive learning temperature, input/output sizes, kernel sizes, strides, activation functions, and image augmentation details. (A hedged contrastive-loss sketch follows the table.)
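
The Open Datasets row describes collecting roughly 130 scripted demonstrations per training task, each 30 timesteps long, in a PyBullet simulation. The authors' actual collection code is in the open-sourced codebase linked above; the following is only a minimal sketch of what such a loop could look like. The URDF path, the `scripted_action` placeholder, and the observation layout are illustrative assumptions, not details from the paper.

```python
# Minimal, hypothetical sketch of scripted demonstration collection in
# PyBullet. The robot URDF path and scripted_action() are placeholders;
# the authors' real collection code is in their open-sourced repository.
import numpy as np
import pybullet as p
import pybullet_data

EPISODE_LEN = 30  # the paper states all demonstrations are 30 timesteps long

def scripted_action(obs):
    """Stand-in for the paper's scripted policy (its Appendix C)."""
    return np.zeros(7)  # assumed action dimensionality; not given in the quote

def collect_demo(robot_urdf="widowx_250.urdf"):  # hypothetical asset path
    p.connect(p.DIRECT)  # headless simulation
    p.setAdditionalSearchPath(pybullet_data.getDataPath())
    p.setGravity(0, 0, -9.8)
    p.loadURDF("plane.urdf")
    robot = p.loadURDF(robot_urdf)  # assumes the URDF file is available
    trajectory = []
    for _ in range(EPISODE_LEN):
        obs = p.getBasePositionAndOrientation(robot)  # stand-in observation
        action = scripted_action(obs)
        # ...applying the action to the arm's joints is omitted here...
        p.stepSimulation()
        trajectory.append((obs, action))
    p.disconnect()
    return trajectory
```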
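The Dataset Splits row describes a color-based bipartition of the 300 tasks into training tasks U and test tasks V. Below is a toy sketch of that split logic, assuming each task record carries a `color` field; the task schema is invented, and the toy color distribution does not reproduce the paper's 65% to 80% training fraction.

```python
# Toy reconstruction of the color-based train/test bipartition. The task
# schema (dicts with a "color" field) is an assumption for illustration.
def bipartition(tasks, train_colors):
    """Split tasks into training tasks U and test tasks V by object color."""
    U = [t for t in tasks if t["color"] in train_colors]
    V = [t for t in tasks if t["color"] not in train_colors]
    return U, V

tasks = [{"name": f"task_{i}", "color": c}
         for i, c in enumerate(["gray", "yellow", "blue", "green"] * 75)]

# Scenario A: train on gray tasks; test on yellow, blue, and green tasks.
U_a, V_a = bipartition(tasks, {"gray"})
# Scenario B: train on gray and yellow tasks; test on blue and green tasks.
U_b, V_b = bipartition(tasks, {"gray", "yellow"})
```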
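The Software Dependencies row notes that DistilBERT is named without a version. For readers reproducing the language-encoding side, here is a hedged sketch using the Hugging Face `transformers` DistilBERT; the paper does not state which library, checkpoint, or pooling strategy it uses, so all three are assumptions here.

```python
# Hedged sketch: encoding a task instruction with DistilBERT via Hugging
# Face transformers. The checkpoint and mean pooling are assumptions; the
# paper names DistilBERT but not the library, version, or pooling scheme.
import torch
from transformers import DistilBertModel, DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
encoder = DistilBertModel.from_pretrained("distilbert-base-uncased")

instruction = "put the green object in the container"  # illustrative text
inputs = tokenizer(instruction, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # shape (1, seq_len, 768)
language_embedding = hidden.mean(dim=1)  # mean pooling over tokens (assumed)
```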
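Finally, the Experiment Setup row lists a contrastive learning temperature among the reported hyperparameters. The paper's exact loss is not quoted here, so the following is only a generic InfoNCE-style sketch of aligning demonstration and language embeddings for the same task, written in PyTorch (the framework itself is an assumption, per the Software Dependencies row); the default temperature value is likewise illustrative.

```python
# Generic InfoNCE-style contrastive loss with a temperature, aligning
# demonstration and language embeddings of matched tasks. This is an
# assumed formulation, not the paper's verified loss.
import torch
import torch.nn.functional as F

def contrastive_loss(demo_emb, lang_emb, temperature=0.1):
    """demo_emb, lang_emb: (batch, dim) embeddings of the same task batch."""
    demo = F.normalize(demo_emb, dim=-1)
    lang = F.normalize(lang_emb, dim=-1)
    logits = demo @ lang.T / temperature  # (batch, batch) cosine similarities
    targets = torch.arange(demo.size(0), device=logits.device)  # diagonal positives
    # symmetric cross-entropy over both matching directions
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))
```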