BAKU: An Efficient Transformer for Multi-Task Policy Learning

Authors: Siddhant Haldar, Zhuoran Peng, Lerrel Pinto

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments on 129 simulated tasks across LIBERO, Meta-World suite, and the Deepmind Control suite exhibit an overall 18% absolute improvement over RT-1 and MT-ACT, with a 36% improvement on the harder LIBERO benchmark. On 30 real-world manipulation tasks, given an average of just 17 demonstrations per task, BAKU achieves a 91% success rate."
Researcher Affiliation | Academia | "Siddhant Haldar, Zhuoran Peng, Lerrel Pinto, New York University"
Pseudocode | No | The paper describes architectural components and algorithmic ideas but does not contain structured pseudocode or an algorithm block.
Open Source Code | Yes | "All of our datasets, and training and evaluation code will be made publicly available. Videos of our trained policies can be seen here: baku-robot.github.io."
Open Datasets | Yes | "We experiment with 90 manipulation tasks from the LIBERO-90 benchmark [34], 30 manipulation tasks from Meta-World suite [76], and 9 locomotion tasks from Deep Mind Control Suite (DMC) [67]."
Dataset Splits | No | The paper discusses training and test phases but does not explicitly provide details about a validation dataset split, its size, or how it was used.
Hardware Specification | Yes | "Training time: Below we provide details about the time required to train BAKU on a single NVIDIA RTX A4000 GPU."
Software Dependencies | Yes | "Transformer architecture: minGPT [29] (with 8 layers and 4 heads)" (a hedged trunk sketch is given below the table)
Experiment Setup | Yes | "The complete list of hyperparameters is provided in Table 4. For RT-1 [6], we use our implementation with an RT-1 action head that discretizes the continuous action into discrete bins uniformly. For MT-ACT [5], we use the open-source implementation with the default hyperparameters. We vary the action chunk length for MT-ACT for different benchmarks, the values for which have been provided in Table 4." (a uniform action-binning sketch is given below the table)
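
The software-dependencies row quotes a minGPT-style transformer trunk with 8 layers and 4 attention heads. Below is a minimal PyTorch stand-in for such a decoder-only trunk, not the authors' minGPT-based implementation; the embedding size, context length, and class name are illustrative assumptions.

```python
import torch
import torch.nn as nn


class CausalTransformerTrunk(nn.Module):
    """GPT-style trunk with the depth/heads quoted above (8 layers, 4 heads).

    Generic PyTorch sketch, not the authors' minGPT code; the embedding size
    and context length are illustrative assumptions.
    """

    def __init__(self, embed_dim=256, num_layers=8, num_heads=4, max_len=64):
        super().__init__()
        self.pos_embedding = nn.Parameter(torch.zeros(1, max_len, embed_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim,
            nhead=num_heads,
            dim_feedforward=4 * embed_dim,
            batch_first=True,
        )
        self.blocks = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, tokens):
        # tokens: (batch, seq_len, embed_dim) observation/task tokens.
        seq_len = tokens.size(1)
        x = tokens + self.pos_embedding[:, :seq_len]
        # Causal mask so each position attends only to earlier positions.
        mask = nn.Transformer.generate_square_subsequent_mask(seq_len).to(tokens.device)
        return self.blocks(x, mask=mask)


# Example: encode a batch of 2 sequences of 10 tokens.
trunk = CausalTransformerTrunk()
features = trunk(torch.randn(2, 10, 256))  # -> (2, 10, 256)
```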
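
The experiment-setup row also mentions an RT-1-style action head that discretizes continuous actions into uniform bins. The sketch below shows one standard way to do such uniform binning and to recover bin-center actions; the bin count, bounds, and function names are illustrative assumptions rather than values from the paper.

```python
import numpy as np


def discretize_actions(actions, low, high, num_bins=256):
    """Map continuous actions to per-dimension integer bin indices (uniform bins)."""
    actions = np.clip(actions, low, high)
    normalized = (actions - low) / (high - low)           # scale to [0, 1]
    bins = np.floor(normalized * num_bins).astype(np.int64)
    return np.clip(bins, 0, num_bins - 1)                 # keep the upper edge in range


def undiscretize_actions(bins, low, high, num_bins=256):
    """Map bin indices back to the continuous bin centers."""
    centers = (bins.astype(np.float64) + 0.5) / num_bins
    return low + centers * (high - low)


# Example: a 2-D action with each dimension bounded in [-1, 1].
a = np.array([0.3, -0.7])
idx = discretize_actions(a, low=-1.0, high=1.0)           # array([166, 38])
approx = undiscretize_actions(idx, low=-1.0, high=1.0)    # close to the original action
```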