Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Generative Planning with 3D-Vision Language Pre-training for End-to-End Autonomous Driving
Authors: Tengpeng Li, Hanli Wang, Xianfei Li, Wenlong Liao, Tao He, Pai Peng
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the challenging nuScenes dataset demonstrate that the proposed scheme achieves excellent performances compared with state-of-the-art methods. Besides, the proposed GPVL presents strong generalization ability and real-time potential when handling high-level commands in various scenarios. ... Substantial experiments are conducted on the complex public nuScenes (Caesar et al. 2020) dataset... The ablation study in Table 3 systematically investigates the contributions of the key components of GPVL on the nuScenes dataset. |
| Researcher Affiliation | Collaboration | 1. College of Electronic and Information Engineering, Tongji University, Shanghai, China; 2. School of Computer Science and Technology, Tongji University, Shanghai, China; 3. COWAROBOT, China; 4. School of Electronic Engineering, University of South China, Hunan, China. EMAIL, pengpai EMAIL |
| Pseudocode | No | The paper describes methods and formulations but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using pre-trained models like BEVformer, BLIP, and BERT, but does not provide any statement or link for the open-source code of their proposed GPVL methodology. |
| Open Datasets | Yes | Substantial experiments are conducted on the complex public nuScenes (Caesar et al. 2020) dataset, which comprises 1,000 traffic scenarios, and the duration of each video is around 20 seconds. This dataset offers over 1.4 million 3D bounding boxes across 23 different object categories. |
| Dataset Splits | Yes | Experiments on the challenging nuScenes dataset demonstrate that the proposed scheme achieves excellent performances compared with state-of-the-art methods. ... We train and test the models on datasets constructed from two different urban environments (i.e., Boston and Singapore). Specifically, two groups of experiments are introduced: (1) training on Boston and testing on Singapore, (2) training on Singapore and testing on Boston. ... In the nuScenes dataset, 87.7% training and 88.2% validation samples consist of simple go straight scenes. |
| Hardware Specification | Yes | The proposed model is trained on the PyTorch framework with 8 NVIDIA RTX A6000 cards. |
| Software Dependencies | No | The paper mentions using PyTorch framework, BERT structure, and Adam W optimizer, but no specific version numbers are provided for these software components. |
| Experiment Setup | Yes | The proposed model aims to predict the trajectory for the future 3 seconds. The input image size is 1280×720. GPVL utilizes ResNet50 (He et al. 2016) to extract the multi-view image features. The numbers of BEV queries, bounding boxes and map points are 200×200, 200 and 100×20, respectively. The feature dimension and hidden size are 768 and 512, respectively. The model utilizes the AdamW (Loshchilov and Hutter 2017) optimizer and weight decay 0.01 in the training process. The learning rates in three training stages are 2×10⁻⁴, 1×10⁻⁴ and 5×10⁻⁶, respectively. The BERT (Devlin et al. 2019) structure is used by 3D-vision language pre-training and cross-modal language model. As for testing, the size of greedy search is set to 1 to generate the trajectory caption. |
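The Experiment Setup row describes a three-stage training schedule (AdamW, weight decay 0.01, stage-wise learning rates of 2×10⁻⁴, 1×10⁻⁴ and 5×10⁻⁶). Since no code is released, the following is only a minimal sketch of how that schedule could be expressed as a configuration; the stage labels and helper name are hypothetical, not from the paper.

```python
# Hypothetical reconstruction of the staged optimizer settings reported in
# the paper. Stage descriptions are illustrative placeholders.
STAGE_LRS = [2e-4, 1e-4, 5e-6]  # learning rates for training stages 1-3
OPTIMIZER_DEFAULTS = {"type": "AdamW", "weight_decay": 0.01}

def optimizer_config(stage: int) -> dict:
    """Return the optimizer settings for a given training stage (1-indexed)."""
    if not 1 <= stage <= len(STAGE_LRS):
        raise ValueError(f"stage must be in 1..{len(STAGE_LRS)}, got {stage}")
    return {**OPTIMIZER_DEFAULTS, "lr": STAGE_LRS[stage - 1]}
```

With a real training loop, `optimizer_config(1)` would feed the first-stage settings (lr=2e-4, weight_decay=0.01) into, e.g., `torch.optim.AdamW`.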