Behavior Transformers: Cloning $k$ modes with one stone
Authors: Nur Muhammad Shafiullah, Zichen Cui, Ariuntuya (Arty) Altanzaya, Lerrel Pinto
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally evaluate BeT on a variety of robotic manipulation and self-driving behavior datasets. We show that BeT significantly improves over prior state-of-the-art work on solving demonstrated tasks while capturing the major modes present in the pre-collected datasets. Finally, through an extensive ablation study, we analyze the importance of every crucial component in BeT. |
| Researcher Affiliation | Academia | Nur Muhammad (Mahi) Shafiullah, Zichen Jeff Cui, Ariuntuya Altanzaya, Lerrel Pinto. New York University. Corresponding author, email: mahi@cs.nyu.edu |
| Pseudocode | No | The paper includes diagrams illustrating the architecture and process, but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All of our datasets, code, and trained models will be made publicly available. |
| Open Datasets | Yes | CARLA [21] uses the Unreal Engine to provide a simulated driving environment in a visually realistic landscape. [21] A. Dosovitskiy, G. Ros, F. Codevilla, A. M. Lopez, and V. Koltun. CARLA: An open urban driving simulator. In Conference on Robot Learning, pages 1-16. PMLR, 2017. |
| Dataset Splits | No | The paper does not provide specific percentages or counts for training/validation/test splits, nor does it reference predefined splits with citations for dataset partitioning. |
| Hardware Specification | Yes | Our models contain on the order of 10^4-10^6 parameters, and even with a small batch size train within an hour for our largest datasets (Block push) on a single desktop GPU. |
| Software Dependencies | No | The paper mentions various models and techniques but does not provide specific software environment details or library versions (e.g., Python version, PyTorch version). |
| Experiment Setup | No | The paper describes the model architecture and loss functions, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed system-level training settings in the main text. |