From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data
Authors: Zichen Jeff Cui, Yibin Wang, Nur Muhammad Mahi Shafiullah, Lerrel Pinto
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally evaluate C-BeT on three simulated benchmarks (visual self-driving in CARLA (Dosovitskiy et al., 2017), multi-modal block pushing (Florence et al., 2021), and simulated kitchen (Gupta et al., 2019)), and on a real Franka robot trained with play data collected by human volunteers. |
| Researcher Affiliation | Academia | Zichen Jeff Cui, Yibin Wang, Nur Muhammad (Mahi) Shafiullah, Lerrel Pinto (New York University) |
| Pseudocode | No | The paper describes the model architecture and training objective in text and diagrams but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states, 'In our work, we base our C-BeT implementation off of the official repo published at https://github.com/notmahi/bet.' This indicates the authors built upon existing code, but the paper does not explicitly state that the specific C-BeT implementation described here is openly released, nor does it provide a link to such a release. |
| Open Datasets | Yes | We experimentally evaluate C-BeT on three simulated benchmarks (visual self-driving in CARLA (Dosovitskiy et al., 2017), multi-modal block pushing (Florence et al., 2021), and simulated kitchen (Gupta et al., 2019)). |
| Dataset Splits | No | The paper describes the datasets used and the training process, but does not explicitly provide train/validation/test splits, whether as percentages, absolute counts, or citations to predefined splits, for reproducibility. |
| Hardware Specification | No | The paper mentions using a 'Franka Emika Panda robot' for real-world experiments, but does not specify the computing hardware (e.g., GPU/CPU models, memory, or cloud instances) used for training or running the models. |
| Software Dependencies | No | The paper mentions various software components like Behavior Transformers, MinGPT, ResNet-18, and the Adam optimizer, but does not provide specific version numbers for any software dependencies (e.g., deep learning frameworks like PyTorch/TensorFlow, or Python versions). |
| Experiment Setup | Yes | The paper provides a detailed 'HYPERPARAMETERS LIST:' in Section C.2, including Table 6 ('Environment-dependent hyperparameters in BeT') and Table 7 ('Shared hyperparameters for BeT training'), which specify values for layers, attention heads, embedding width, dropout probability, context size, training epochs, batch size, number of bins, future conditional frames, optimizer (Adam), learning rate, weight decay, betas, and gradient clip norm. |
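For the Experiment Setup row, the two hyperparameter groups the paper reports (environment-dependent in Table 6, shared training settings in Table 7) can be sketched as config objects. This is a minimal illustrative sketch: every field name and value below is a placeholder assumption, not the paper's actual settings, which must be read from Tables 6 and 7 directly.

```python
from dataclasses import dataclass, field

@dataclass
class BeTEnvConfig:
    """Environment-dependent hyperparameters (cf. paper's Table 6).
    All values are placeholders, not the paper's reported settings."""
    n_layers: int = 6                   # transformer layers
    n_heads: int = 6                    # attention heads
    embed_width: int = 120              # embedding width
    context_size: int = 10              # observation context window
    n_bins: int = 64                    # action discretization bins
    future_conditional_frames: int = 10 # frames used for future conditioning

@dataclass
class BeTTrainConfig:
    """Shared training hyperparameters (cf. paper's Table 7).
    All values are placeholders, not the paper's reported settings."""
    epochs: int = 50
    batch_size: int = 64
    dropout: float = 0.1
    optimizer: str = "adam"
    learning_rate: float = 1e-4
    weight_decay: float = 0.1
    betas: tuple = (0.9, 0.95)          # Adam beta coefficients
    grad_clip_norm: float = 1.0         # gradient clipping threshold

env_cfg = BeTEnvConfig()
train_cfg = BeTTrainConfig()
print(train_cfg.optimizer, env_cfg.n_bins)
```

Grouping the settings this way mirrors the paper's own split between per-environment architecture choices and training settings shared across all benchmarks.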