Language Control Diffusion: Efficiently Scaling through Space, Time, and Tasks

Authors: Edwin Zhang, Yujie Lu, Shinda Huang, William Yang Wang, Amy Zhang

ICLR 2024

Reproducibility assessment. Each entry below gives the reproducibility variable, the assessed result, and the supporting LLM response.
Research Type: Experimental
LLM Response: Comparing LCD with other state-of-the-art models on the CALVIN language robotics benchmark finds that LCD outperforms other SOTA methods in multi-task success rates, whilst improving inference speed over other comparable diffusion models by 3.3x to 15x. "We evaluate on the CALVIN and CLEVR-Robot benchmarks (Mees et al., 2022b; Jiang et al., 2019), both challenging multi-task, long-horizon benchmarks."
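As a rough illustration of how the inference-speed comparison above (a 3.3x to 15x wall-clock speedup over comparable diffusion models) could be measured, the sketch below times repeated policy forward passes. The policy objects and their call signatures are hypothetical and are not taken from the released code.

```python
# Generic latency measurement sketch; `policy` and `observations` are
# hypothetical stand-ins, not objects from the LCD repository.
import time
import torch

@torch.no_grad()
def mean_latency(policy, observations, n_warmup=5, n_trials=50):
    for _ in range(n_warmup):               # warm up kernels and caches
        policy(observations)
    start = time.perf_counter()
    for _ in range(n_trials):
        policy(observations)
    return (time.perf_counter() - start) / n_trials

# Relative speedup of one policy over another:
# speedup = mean_latency(baseline_policy, obs) / mean_latency(lcd_policy, obs)
```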
Researcher Affiliation: Academia
LLM Response: Edwin Zhang (Harvard, Founding); Yujie Lu, Shinda Huang, William Wang (University of California, Santa Barbara); Amy Zhang (UT Austin)
Pseudocode: Yes
LLM Response: Algorithm 1, "Hierarchical Diffusion Policy Training"
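To make the referenced training procedure concrete, here is a minimal sketch of a DDPM-style training step for a high-level planner that denoises latent state plans conditioned on a language embedding. This is an assumed, generic formulation for illustration only; the function and argument names (diffusion_training_step, denoiser, plans, lang_emb) are hypothetical and do not come from the paper's Algorithm 1 or the released code.

```python
# Minimal sketch of one noise-prediction training step for a language-conditioned
# high-level diffusion planner (assumptions, not the authors' implementation).
import torch
import torch.nn.functional as F

def diffusion_training_step(denoiser, plans, lang_emb, alphas_cumprod, optimizer):
    """denoiser:       model eps_theta(x_t, t, lang) -> predicted noise
    plans:          (B, H, D) clean latent state plans from the dataset
    lang_emb:       (B, E) language embeddings of the instruction
    alphas_cumprod: (T,) cumulative product of the noise schedule
    """
    B = plans.shape[0]
    T = alphas_cumprod.shape[0]

    # Sample a diffusion timestep and Gaussian noise for each example.
    t = torch.randint(0, T, (B,), device=plans.device)
    noise = torch.randn_like(plans)

    # Forward (noising) process: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps.
    a_bar = alphas_cumprod[t].view(B, 1, 1)
    noisy_plans = a_bar.sqrt() * plans + (1.0 - a_bar).sqrt() * noise

    # Predict the injected noise, conditioned on the language embedding.
    pred_noise = denoiser(noisy_plans, t, lang_emb)
    loss = F.mse_loss(pred_noise, noise)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```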
Open Source Code: Yes
LLM Response: "We release our code and models at https://github.com/ezhang7423/language-control-diffusion/."
Open Datasets: Yes
LLM Response: "We evaluate on the CALVIN and CLEVR-Robot benchmarks (Mees et al., 2022b; Jiang et al., 2019)... In order to first train our LLP, we use the original training dataset provided with the CALVIN benchmark..." Dataset link: http://calvin.cs.uni-freiburg.de/
Dataset Splits: No
LLM Response: The paper does not explicitly state specific training, validation, and test splits (e.g., percentages or exact counts). It refers to using the "original training dataset" and evaluates on benchmarks, but does not detail how the data was partitioned for validation.
Hardware Specification: Yes
LLM Response: "Baselines are run with either 8 Titan RTX or 8 A10 GPUs following the original author guidelines, whilst our experiments are run with a single RTX or A10." Hardware type: NVIDIA Titan RTX, NVIDIA A10.
Software Dependencies: No
LLM Response: The paper lists PyTorch as its software but does not specify a version number for PyTorch or any other software component, which is necessary for reproducibility.
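Since the missing piece here is a version pin, a minimal way to record the environment when rerunning the experiments is shown below. These are standard Python and PyTorch attributes, not commands from the paper or its repository.

```python
# Record the exact interpreter, framework, and CUDA versions in use
# (generic environment logging; not part of the LCD codebase).
import sys
import torch

print("python:", sys.version.split()[0])
print("torch:", torch.__version__)
print("cuda (build):", torch.version.cuda)
print("cudnn:", torch.backends.cudnn.version())
```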
Experiment Setup: Yes
LLM Response: Table 6 (hyperparameters for our methods in Table 1) includes, e.g., HULC Trainer: Max Epochs 30, ..., Optimizer Adam, Learning Rate 2e-4, ...; Gaussian Diffusion: Action Dimension 32, ..., Batch Size 512, Learning Rate 2e-4, Train Steps 250k, ... Table 7 (hyperparameters for the Transformer in Table 5): Number of gradient steps 100K, Mini-batch size 512, Transformer hidden dim 4096, Transformer layers 4, ...
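The quoted hyperparameters can be collected into a small set of typed configuration objects, as in the sketch below. The groupings and field names are assumptions based on the excerpt from Tables 6 and 7, not the authors' actual configuration files; Table 6 also lists an Adam optimizer with learning rate 2e-4, whose exact module grouping is elided in the excerpt.

```python
# Hypothetical config objects collecting the hyperparameters quoted above.
from dataclasses import dataclass

@dataclass
class HULCTrainerConfig:          # Table 6, HULC Trainer (excerpted)
    max_epochs: int = 30

@dataclass
class GaussianDiffusionConfig:    # Table 6, Gaussian Diffusion (excerpted)
    action_dim: int = 32
    batch_size: int = 512
    learning_rate: float = 2e-4
    train_steps: int = 250_000

@dataclass
class TransformerConfig:          # Table 7, Transformer in Table 5
    gradient_steps: int = 100_000
    mini_batch_size: int = 512
    hidden_dim: int = 4096
    num_layers: int = 4
```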