Behavior Generation with Latent Actions

Authors: Seungjae Lee, Yibin Wang, Haritheja Etukuru, H. Jin Kim, Nur Muhammad Mahi Shafiullah, Lerrel Pinto

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Across seven environments including simulated manipulation, autonomous driving, and robotics, VQ-BeT improves on state-of-the-art models such as BeT and Diffusion Policies. Importantly, we demonstrate VQ-BeT's improved ability to capture behavior modes while accelerating inference speed 5× over Diffusion Policies. Through extensive experiments across eight benchmark environments, we present the following experimental insights:
Researcher Affiliation | Academia | 1 New York University; 2 Department of Aerospace Engineering, Seoul National University; 3 Artificial Intelligence Institute of SNU
Pseudocode | No | The paper describes the VQ-BeT model and its two stages (an action discretization phase and a VQ-BeT learning phase) conceptually and with figures, but does not include a formal pseudocode or algorithm block.
Open Source Code | Yes | Code is available at https://github.com/jayLEE0301/vq_bet_official
Open Datasets | Yes | Franka Kitchen: We use the Franka Kitchen robotic manipulation environment introduced in (Gupta et al., 2019)... PushT: We adopt the PushT environment introduced in (Chi et al., 2023)... nuScenes self-driving: ...we use the nuScenes (Caesar et al., 2020) self-driving environment as a test setup.
Dataset Splits | Yes | using 80% of them for training and 20% for validation
Hardware Specification | Yes | RTX A4000 GPU, 4-Core Intel CPU
Software Dependencies | No | The paper mentions models and frameworks such as a GPT-like transformer architecture and Residual VQ-VAE, as well as specific environments, but does not provide explicit version numbers for software dependencies or libraries (e.g., PyTorch, TensorFlow, Python version).
Experiment Setup | Yes | Table 13. Hyperparameters for VQ-BeT
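Although the paper provides no formal pseudocode, its action discretization phase rests on residual vector quantization (the Residual VQ-VAE mentioned above): each codebook greedily quantizes the residual left by the previous one, and the summed code vectors reconstruct the action. A minimal encoding sketch, assuming fixed NumPy codebooks rather than the authors' learned VQ-VAE:

```python
import numpy as np

def residual_vq_encode(action, codebooks):
    """Greedy residual vector quantization.

    Each codebook (shape: num_codes x dim) picks the code nearest to the
    current residual; the residual is then reduced by that code vector.
    Returns the chosen code indices and the summed reconstruction.
    """
    residual = np.asarray(action, dtype=float).copy()
    recon = np.zeros_like(residual)
    codes = []
    for cb in codebooks:
        dists = np.linalg.norm(cb - residual, axis=1)  # distance to each code
        idx = int(np.argmin(dists))                    # nearest code index
        codes.append(idx)
        recon += cb[idx]
        residual -= cb[idx]
    return codes, recon
```

With two small hand-built codebooks, the second codebook refines the coarse choice of the first, which is the "coarse-to-fine" behavior the paper attributes to residual quantization.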
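The 80%/20% train/validation split quoted above is a standard shuffle-and-cut split over demonstrations; a minimal sketch (function name, seed, and trajectory representation are illustrative, not taken from the paper's code):

```python
import random

def train_val_split(trajectories, train_frac=0.8, seed=0):
    """Shuffle trajectory indices deterministically, then cut at train_frac."""
    idx = list(range(len(trajectories)))
    random.Random(seed).shuffle(idx)
    cut = int(train_frac * len(idx))
    train = [trajectories[i] for i in idx[:cut]]
    val = [trajectories[i] for i in idx[cut:]]
    return train, val
```

Splitting at the trajectory level (rather than per timestep) keeps all frames of a demonstration on the same side of the split, which avoids leaking near-duplicate states into validation.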