Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

Future-Aware End-to-End Driving: Bidirectional Modeling of Trajectory Planning and Scene Evolution

Authors: Bozhou Zhang, Nan Song, Jingyu Li, Xiatian Zhu, Jiankang Deng, Li Zhang

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on the NAVSIM and nuScenes benchmarks show that SeerDrive significantly outperforms existing state-of-the-art methods.
Researcher Affiliation	Academia	¹School of Data Science, Fudan University; ²Shanghai Innovation Institute; ³University of Surrey; ⁴Imperial College London
Pseudocode	No	The paper describes its methodology using text and figures, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	https://github.com/LogosRoboticsGroup/SeerDrive
Open Datasets	Yes	We conduct experiments on two large-scale real-world autonomous driving datasets: NAVSIM [19] and nuScenes [17].
Dataset Splits	Yes	nuScenes includes 1,000 scenes with 6-camera and LiDAR data at 2 Hz. We follow the standard 700/150 train/validation split and evaluate planning in an open-loop setting.
Hardware Specification	Yes	The model is trained on 8 NVIDIA GeForce RTX 3090 GPUs.
Software Dependencies	No	The paper mentions 'AdamW [48]' for optimization and specific backbone networks like 'ResNet34' and 'ResNet50', but does not provide specific version numbers for any key software components or libraries.
Experiment Setup	Yes	For NAVSIM, the batch size is 16 per GPU, with 30 training epochs and a total training time of around 5 hours. For nuScenes, the batch size is 1 per GPU, with 12 epochs and a training time of about 12 hours. The learning rate is set to 2 × 10⁻⁴ for NAVSIM and 1 × 10⁻⁴ for nuScenes, both optimized using AdamW [48]. The loss balancing factors are set to λ1 = 10, λ2 = 0.1, and λ3 = 1 for NAVSIM, and all set to 1 for nuScenes.
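
The reported experiment setup can be written down as a small configuration sketch, which may help when checking a reproduction against the table. This is an illustration only and is not taken from the SeerDrive repository: the names `TrainConfig`, `global_batch_size`, and `weighted_loss` are hypothetical, the learning rates are read as 2 × 10⁻⁴ and 1 × 10⁻⁴, the global batch size assumes plain data-parallel training across all 8 GPUs, and the three loss terms are placeholders because the table does not say which term each λ weights.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""

    batch_size_per_gpu: int
    epochs: int
    learning_rate: float
    loss_weights: Tuple[float, float, float]  # (lambda1, lambda2, lambda3)
    num_gpus: int = 8  # from the Hardware Specification row

    @property
    def global_batch_size(self) -> int:
        # Assumption: standard data parallelism, one replica per GPU.
        return self.batch_size_per_gpu * self.num_gpus

    def weighted_loss(self, term1: float, term2: float, term3: float) -> float:
        # Weighted sum of three loss terms; which term each weight
        # applies to is not stated in the table, so these are placeholders.
        w1, w2, w3 = self.loss_weights
        return w1 * term1 + w2 * term2 + w3 * term3


# Values as reported in the Experiment Setup row.
NAVSIM = TrainConfig(batch_size_per_gpu=16, epochs=30,
                     learning_rate=2e-4, loss_weights=(10.0, 0.1, 1.0))
NUSCENES = TrainConfig(batch_size_per_gpu=1, epochs=12,
                       learning_rate=1e-4, loss_weights=(1.0, 1.0, 1.0))

if __name__ == "__main__":
    for name, cfg in (("NAVSIM", NAVSIM), ("nuScenes", NUSCENES)):
        print(name, cfg.global_batch_size, cfg.learning_rate)
```

Under the data-parallel assumption, the per-GPU batch sizes in the table correspond to global batch sizes of 128 for NAVSIM and 8 for nuScenes.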