Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers

Authors: Lirui Wang, Xinlei Chen, Jialiang Zhao, Kaiming He

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments to investigate the scaling behaviors of training objectives, to the extent of 52 datasets. HPTs outperform several baselines and enhance the fine-tuned policy performance by over 20% on unseen tasks in multiple simulator benchmarks and real-world settings.
Researcher Affiliation | Collaboration | Lirui Wang (MIT CSAIL), Xinlei Chen (Meta, FAIR), Jialiang Zhao (MIT CSAIL), Kaiming He (MIT CSAIL)
Pseudocode | No | The paper describes the architecture and training process in detail with diagrams and text, but it does not include a dedicated section or figure labeled as 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | As an attempt to scale heterogeneous pre-training, our code and weights are open-sourced, and we hope that HPT can shed some light on learning robot representations from heterogeneous embodiments and tasks. ... https://github.com/liruiw/HPT and https://github.com/liruiw/lerobot
Open Datasets | Yes | We use 27 robot teleoperation datasets, including a subset of the recently public Open-X Embodiment dataset [14] as the training corpus. ... In total, we use a subset of 42 datasets in the Open-X Embodiment dataset [14], including the recent Droid [76] dataset.
Dataset Splits | Yes | We use a maximum of 1000 trajectories from each dataset and a total number of 16k trajectories, and a held-out validation dataset with a maximum 200 trajectories per data source. (A split-capping sketch follows the table.)
Hardware Specification | Yes | The inference time during transfer on an RTX 3070 GPU is 47Hz for HPT-base and 19Hz for HPT-XL... The compute resources for these pre-training experiments range from 8 V-100s to 128 V-100s... We train with batch size 256 on a single NVIDIA RTX 2080Ti GPU for 20000 iterations.
Software Dependencies | No | The paper mentions software components like 'ResNet18' and 'T5' with citations, but does not provide specific version numbers for these or other key software dependencies such as PyTorch, TensorFlow, or CUDA.
Experiment Setup | Yes | The training uses a batch size of 256 for 80k iterations... We train HPT with AdamW [47] optimizer with a weight decay ratio 0.05, and a base learning rate of 0.0002 with a cosine learning rate schedule with warmups and dropouts. (A training-configuration sketch follows the table.)
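
The per-source trajectory caps quoted in the Dataset Splits row (at most 1000 training and 200 held-out validation trajectories per data source, roughly 16k training trajectories in total) can be expressed as a small capping routine. The following is a minimal sketch under assumptions: the shuffling, the random seed, and the non-overlapping carve-out of validation trajectories are illustrative choices, not details confirmed by the paper.

```python
import random
from typing import Dict, List, Tuple

def build_splits(
    trajectories_by_dataset: Dict[str, List[object]],
    max_train_per_dataset: int = 1000,
    max_val_per_dataset: int = 200,
    seed: int = 0,
) -> Tuple[Dict[str, List[object]], Dict[str, List[object]]]:
    """Cap each data source at 1000 training and 200 validation trajectories.

    The caps mirror the quoted description; the shuffling, the seed, and the
    non-overlapping train/val carve-out are illustrative assumptions.
    """
    rng = random.Random(seed)
    train, val = {}, {}
    for name, trajectories in trajectories_by_dataset.items():
        pool = list(trajectories)
        rng.shuffle(pool)
        train[name] = pool[:max_train_per_dataset]
        val[name] = pool[max_train_per_dataset:max_train_per_dataset + max_val_per_dataset]
    return train, val
```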
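The optimizer and schedule quoted in the Experiment Setup row map onto a standard PyTorch recipe. Below is a minimal sketch using the quoted hyperparameters (AdamW, weight decay 0.05, base learning rate 0.0002, cosine schedule with warmup, batch size 256, 80k iterations); the warmup length, the stand-in model, and the dummy loss are placeholders assumed for illustration, not values taken from the paper or the released code.

```python
import math
import torch

# Hyperparameters quoted in the table; the warmup length is an assumed
# placeholder, not a value reported in the paper.
BASE_LR = 2e-4
WEIGHT_DECAY = 0.05
BATCH_SIZE = 256
TOTAL_ITERS = 80_000
WARMUP_ITERS = 2_000  # assumption for illustration

# Stand-in model and data; the real policy trunk and dataloader come from
# the open-sourced HPT code.
model = torch.nn.Linear(512, 512)
optimizer = torch.optim.AdamW(model.parameters(), lr=BASE_LR, weight_decay=WEIGHT_DECAY)

def lr_lambda(step: int) -> float:
    """Linear warmup followed by cosine decay to zero."""
    if step < WARMUP_ITERS:
        return step / max(1, WARMUP_ITERS)
    progress = (step - WARMUP_ITERS) / max(1, TOTAL_ITERS - WARMUP_ITERS)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(TOTAL_ITERS):
    batch = torch.randn(BATCH_SIZE, 512)   # stand-in for a real training batch
    loss = model(batch).pow(2).mean()      # stand-in for the policy loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```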