Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Videos are Sample-Efficient Supervisions: Behavior Cloning from Videos via Latent Representations
Authors: Xin Liu, Haoran Li, Dongbin Zhao
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on a set of challenging visual control tasks, including 16 discrete control tasks from the Procgen benchmark [30] and 12 continuous control tasks from the Deepmind Control suite (DMControl) [31] and Metaworld [32]. |
| Researcher Affiliation | Academia | Xin Liu (1,2), Haoran Li (1,2), Dongbin Zhao (1,2). (1) State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences; (2) School of Artificial Intelligence, University of Chinese Academy of Sciences. EMAIL |
| Pseudocode | Yes | We provide a diagram in Figure 2 and pseudocode in Appendix A. ... A.1 Pseudo Code Algorithm 1 The pseudo code of the proposed BCV-LR. |
| Open Source Code | Yes | We provide the implementation of BCV-LR at https://github.com/liuxin0824/BCV-LR. |
| Open Datasets | Yes | We conduct extensive experiments on a set of challenging visual control tasks, including 16 discrete control tasks from the Procgen benchmark [30] and 12 continuous control tasks from the Deepmind Control suite (DMControl) [31] and Metaworld [32]. The expert video dataset containing 8M steps is generated by well-trained RL agents, provided by [21]. |
| Dataset Splits | No | The paper mentions using an "expert video dataset containing 8M steps" and interacting with environments for a limited number of "environmental steps" (e.g., 100k, 50k, 20k), but it does not specify explicit training/test/validation splits for these datasets or how they are partitioned in a formal manner beyond the online learning context. |
| Hardware Specification | Yes | The experiments of BCV-LR are conducted using V100 or A800 GPUs, and the complete workflow for each task can be finished within five hours on a single GPU. |
| Software Dependencies | No | The paper provides detailed hyper-parameter tables but does not explicitly list specific software dependencies (e.g., programming language versions, library names with version numbers like PyTorch, TensorFlow, or scikit-learn). |
| Experiment Setup | Yes | Appendix C.2 provides detailed hyper-parameter settings in Table 11 ("Default hyper-parameter settings in discrete Procgen") and Table 12 ("Default hyper-parameter settings in DMControl"). |