Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
BAKU: An Efficient Transformer for Multi-Task Policy Learning
Authors: Siddhant Haldar, Zhuoran Peng, Lerrel Pinto
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on 129 simulated tasks across LIBERO, Meta-World suite, and the DeepMind Control suite exhibit an overall 18% absolute improvement over RT-1 and MT-ACT, with a 36% improvement on the harder LIBERO benchmark. On 30 real-world manipulation tasks, given an average of just 17 demonstrations per task, BAKU achieves a 91% success rate. |
| Researcher Affiliation | Academia | Siddhant Haldar, Zhuoran Peng, Lerrel Pinto; New York University |
| Pseudocode | No | The paper describes architectural components and algorithmic ideas but does not contain a structured pseudocode or algorithm block. |
| Open Source Code | Yes | All of our datasets, and training and evaluation code will be made publicly available. Videos of our trained policies can be seen here: baku-robot.github.io. |
| Open Datasets | Yes | We experiment with 90 manipulation tasks from the LIBERO-90 benchmark [34], 30 manipulation tasks from Meta-World suite [76], and 9 locomotion tasks from DeepMind Control Suite (DMC) [67]. |
| Dataset Splits | No | The paper discusses training and test phases but does not explicitly provide details about a validation dataset split, its size, or how it was used. |
| Hardware Specification | Yes | Training time: Below we provide details about the time required to train BAKU on a single NVIDIA RTX A4000 GPU. |
| Software Dependencies | Yes | Transformer architecture: minGPT [29] (with 8 layers and 4 heads) |
| Experiment Setup | Yes | The complete list of hyperparameters is provided in Table 4. For RT-1 [6], we use our implementation with an RT-1 action head that discretizes the continuous action into discrete bins uniformly. For MT-ACT [5], we use the open-source implementation with the default hyperparameters. We vary the action chunk length for MT-ACT for different benchmarks, the values for which have been provided in Table 4. |
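
The Experiment Setup row mentions an RT-1 action head that uniformly discretizes continuous actions into bins. The sketch below illustrates one way such uniform binning could work. It assumes per-dimension action bounds `low` and `high` and a bin count `num_bins`; the default of 256 is illustrative and is not a value reported in this summary. The function names are hypothetical and are not taken from the BAKU or RT-1 code.

```python
import numpy as np


def discretize(action, low, high, num_bins=256):
    """Map continuous actions to integer bin indices in [0, num_bins - 1].

    Hypothetical helper: bounds and bin count are assumptions, not paper values.
    """
    action = np.clip(action, low, high)       # keep values inside the bounds
    scaled = (action - low) / (high - low)    # rescale to [0, 1]
    bins = (scaled * num_bins).astype(np.int64)
    return np.minimum(bins, num_bins - 1)     # the upper bound falls in the last bin


def undiscretize(bins, low, high, num_bins=256):
    """Map bin indices back to continuous actions using bin centers."""
    return low + (bins + 0.5) / num_bins * (high - low)


if __name__ == "__main__":
    a = np.array([-1.0, 0.0, 1.0])
    idx = discretize(a, low=-1.0, high=1.0, num_bins=4)
    print(idx)
    print(undiscretize(idx, low=-1.0, high=1.0, num_bins=4))
```

With equal-width bins, the round-trip error per dimension is at most half a bin width, which is the usual trade-off when a classification-style head stands in for continuous regression.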