From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data

Authors: Zichen Jeff Cui, Yibin Wang, Nur Muhammad Mahi Shafiullah, Lerrel Pinto

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally evaluate C-BeT on three simulated benchmarks (visual self-driving in CARLA (Dosovitskiy et al., 2017), multi-modal block pushing (Florence et al., 2021), and simulated kitchen (Gupta et al., 2019)), and on a real Franka robot trained with play data collected by human volunteers.
Researcher Affiliation | Academia | Zichen Jeff Cui, Yibin Wang, Nur Muhammad (Mahi) Shafiullah, Lerrel Pinto (New York University)
Pseudocode | No | The paper describes the model architecture and training objective in text and diagrams but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper states, 'In our work, we base our C-BeT implementation off of the official repo published at https://github.com/notmahi/bet.' This indicates the authors built upon existing code, but it does not explicitly state that the specific C-BeT implementation described in this paper is openly released, nor does it provide a link to such a release.
Open Datasets | Yes | We experimentally evaluate C-BeT on three simulated benchmarks (visual self-driving in CARLA (Dosovitskiy et al., 2017), multi-modal block pushing (Florence et al., 2021), and simulated kitchen (Gupta et al., 2019)).
Dataset Splits | No | The paper describes the datasets used and the training process, but it does not specify train/validation/test splits by percentage or absolute counts, nor does it reference predefined splits with citations for reproducibility.
Hardware Specification | No | The paper mentions using a 'Franka Emika Panda robot' for real-world experiments, but does not specify the computing hardware (e.g., GPU/CPU models, memory, or cloud instances) used for training or running the models.
Software Dependencies | No | The paper mentions various software components such as Behavior Transformers, minGPT, ResNet-18, and the Adam optimizer, but does not provide specific version numbers for any software dependencies (e.g., deep learning frameworks like PyTorch/TensorFlow, or Python versions).
Experiment Setup | Yes | The paper provides a detailed 'Hyperparameters List' in Section C.2, including Table 6 ('Environment-dependent hyperparameters in BeT') and Table 7 ('Shared hyperparameters for BeT training'), which specify values for layers, attention heads, embedding width, dropout probability, context size, training epochs, batch size, number of bins, future conditional frames, optimizer (Adam), learning rate, weight decay, betas, and gradient clip norm.
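
For readers planning a reproduction, the sketch below shows one way such a configuration could be organized in Python. It is a minimal, illustrative example only: the field names mirror the hyperparameter categories listed above, but every value is a placeholder assumption, not a figure taken from Tables 6 or 7 of the paper.

from dataclasses import dataclass
from typing import Tuple


@dataclass
class BeTTrainingConfig:
    """Hypothetical container for the hyperparameter categories in Appendix C.2."""

    # Environment-dependent transformer settings (cf. Table 6); placeholder values
    n_layers: int = 4
    n_attention_heads: int = 4
    embedding_width: int = 128
    dropout_prob: float = 0.1
    context_size: int = 10
    n_bins: int = 32                       # action-discretization bins
    future_conditional_frames: int = 10

    # Shared training settings (cf. Table 7); placeholder values
    epochs: int = 100
    batch_size: int = 64
    optimizer: str = "adam"
    learning_rate: float = 1e-4
    weight_decay: float = 0.1
    betas: Tuple[float, float] = (0.9, 0.95)
    grad_clip_norm: float = 1.0


if __name__ == "__main__":
    # Instantiate with defaults and inspect; a reproduction would substitute
    # the per-environment values reported in the paper's appendix.
    cfg = BeTTrainingConfig()
    print(cfg)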