Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Behavior Generation with Latent Actions
Authors: Seungjae Lee, Yibin Wang, Haritheja Etukuru, H. Jin Kim, Nur Muhammad Mahi Shafiullah, Lerrel Pinto
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Across seven environments including simulated manipulation, autonomous driving, and robotics, VQ-BeT improves on state-of-the-art models such as BeT and Diffusion Policies. Importantly, we demonstrate VQ-BeT's improved ability to capture behavior modes while accelerating inference speed 5× over Diffusion Policies. Through extensive experiments across eight benchmark environments, we present the following experimental insights: |
| Researcher Affiliation | Academia | (1) New York University; (2) Department of Aerospace Engineering, Seoul National University; (3) Artificial Intelligence Institute of SNU |
| Pseudocode | No | The paper describes the VQ-BeT model and its two stages (the action discretization phase and the VQ-BeT learning phase) conceptually and with figures, but does not include formal pseudocode or an algorithm block. |
| Open Source Code | Yes | Code is available at https://github.com/jayLEE0301/vq_bet_official |
| Open Datasets | Yes | Franka Kitchen: We use the Franka Kitchen robotic manipulation environment introduced in (Gupta et al., 2019)... PushT: We adopt the PushT environment introduced in (Chi et al., 2023)... nuScenes self-driving: ...we use the nuScenes (Caesar et al., 2020) self-driving environment as a test setup. |
| Dataset Splits | Yes | using 80% of them for training and 20% for validation |
| Hardware Specification | Yes | RTX A4000 GPU; 4-core Intel CPU |
| Software Dependencies | No | The paper mentions various models and frameworks like GPT-like transformer architecture, Residual VQ-VAE, and specific environments, but does not provide explicit version numbers for software dependencies or libraries (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | Table 13. Hyperparameters for VQ-BeT |