Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Behavior Transformers: Cloning $k$ modes with one stone

Authors: Nur Muhammad Shafiullah, Zichen Cui, Ariuntuya (Arty) Altanzaya, Lerrel Pinto

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We experimentally evaluate BeT on a variety of robotic manipulation and self-driving behavior datasets. We show that BeT significantly improves over prior state-of-the-art work on solving demonstrated tasks while capturing the major modes present in the pre-collected datasets. Finally, through an extensive ablation study, we analyze the importance of every crucial component in BeT.
Researcher Affiliation	Academia	Nur Muhammad (Mahi) Shafiullah, Zichen Jeff Cui, Ariuntuya Altanzaya, Lerrel Pinto. New York University. Corresponding author, email: EMAIL
Pseudocode	No	The paper includes diagrams illustrating the architecture and process, but no structured pseudocode or algorithm blocks.
Open Source Code	Yes	All of our datasets, code, and trained models will be made publicly available.
Open Datasets	Yes	CARLA [21] uses the Unreal Engine to provide a simulated driving environment in a visually realistic landscape. [21] A. Dosovitskiy, G. Ros, F. Codevilla, A. M. Lopez, and V. Koltun. Carla: An open urban driving simulator. In Conference on robot learning, pages 1-16. PMLR, 2017.
Dataset Splits	No	The paper does not provide specific percentages or counts for training/validation/test splits, nor does it reference predefined splits with citations for dataset partitioning.
Hardware Specification	Yes	Our models contain on the order of 10^4-10^6 parameters, and even with a small batch size trains within an hour for our largest datasets (Block push) on a single desktop GPU.
Software Dependencies	No	The paper mentions various models and techniques but does not provide specific software environment details or library versions (e.g., Python version, PyTorch version).
Experiment Setup	No	The paper describes the model architecture and loss functions, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed system-level training settings in the main text.