Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution

Authors: Zhanyi Sun, Shuran Song

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Both simulated and real-world experiments show that LPB improves both policy robustness and data efficiency, enabling reliable manipulation from limited expert data and without additional human correction or annotation.
Researcher Affiliation | Academia | Zhanyi Sun Shuran Song Stanford University project-latentpolicybarrier.github.io
Pseudocode | Yes | Algorithm 1 Latent Policy Barrier (Inference time)
Open Source Code | Yes | Code and data for reproducing the result will be made publicly available.
Open Datasets | Yes | For simulation experiments, the demonstration data is taken from public datasets ([41, 12]).
Dataset Splits | Yes | For each Robomimic task (Square, Tool-Hang, Transport) and for Push-T, we keep 20% of the original expert demonstrations. For each task, a base diffusion policy is trained on these demonstrations. For Libero10, we use all 50 provided demonstrations for each of the ten tasks to train a language-conditioned, multi-task base diffusion policy.
Hardware Specification | Yes | All simulated experiments are run on a single NVIDIA L40S GPU (46 GB VRAM). ... The dynamics model is trained in parallel on six NVIDIA L40S GPUs and converges in approximately 36 h.
Software Dependencies | No | The paper mentions software components like "Diffusion Policy", "ResNet-18", "U-Net", "Vision Transformer (ViT)", and low-level controllers from GitHub, but does not provide specific version numbers for these software dependencies as required.
Experiment Setup | Yes | Task-specific and shared hyperparameters are provided in Table 5 and Table 6, respectively. ... Training hyperparameters for fϕ are provided in Table 8. ... The OOD threshold τ is chosen empirically by rolling out the final policy checkpoint, while the guidance scale η is selected via a grid search. Both η and τ for each task are listed in Table 9.
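
The Dataset Splits and Experiment Setup rows describe two steps that a short sketch can make concrete: keeping 20% of the expert demonstrations, and choosing the guidance scale η by grid search. The Python below is a minimal illustration under stated assumptions, not the authors' code. The quoted text does not say how the retained 20% is chosen, which η values are searched, or which metric ranks them, so the seeded shuffle, the function names `subsample_demos` and `grid_search_guidance_scale`, and the `evaluate` callback are all hypothetical.

```python
import random
from typing import Callable, List, Sequence


def subsample_demos(
    demo_ids: Sequence[str], keep_fraction: float = 0.2, seed: int = 0
) -> List[str]:
    """Keep a fixed fraction of demonstrations using a seeded shuffle.

    Assumption: the selection rule is uniform random with a fixed seed;
    the source only states that 20% of the demonstrations are kept.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    ids = list(demo_ids)
    random.Random(seed).shuffle(ids)  # local RNG, so global state is untouched
    n_keep = max(1, round(len(ids) * keep_fraction))
    return sorted(ids[:n_keep])


def grid_search_guidance_scale(
    candidates: Sequence[float], evaluate: Callable[[float], float]
) -> float:
    """Return the candidate guidance scale with the highest score.

    Assumption: `evaluate` is a hypothetical callback that maps a
    guidance scale to a scalar score such as a rollout success rate.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(candidates, key=evaluate)


if __name__ == "__main__":
    demos = [f"demo_{i}" for i in range(200)]
    kept = subsample_demos(demos, keep_fraction=0.2, seed=0)
    print(len(kept))

    # Toy score that peaks at 0.5, standing in for real policy rollouts.
    best = grid_search_guidance_scale([0.1, 0.5, 1.0], lambda eta: -(eta - 0.5) ** 2)
    print(best)
```

Fixing the seed makes the retained subset reproducible across runs, which is the property a dataset-split criterion is meant to check. A real evaluation would replace the toy score with rollouts of the trained policy.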