Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy
Authors: Rui Zhao, Yuze Fan, Ziguo Chen, Fei Gao, Zhenhai Gao
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that DiffE2E achieves state-of-the-art performance on both CARLA closed-loop benchmarks and NAVSIM evaluations. The proposed unified framework that integrates diffusion and explicit strategies provides a generalizable paradigm for hybrid action representation and shows substantial potential for extension to broader domains, including embodied intelligence. |
| Researcher Affiliation | Academia | ¹College of Automotive Engineering, Jilin University; ²National Key Laboratory of Automotive Chassis Integration and Bionics, Jilin University |
| Pseudocode | No | The paper describes the methodology in prose and mathematical equations within Section 3 'Methodology', but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Justification: The code and model checkpoints will be released soon. |
| Open Datasets | Yes | This research is primarily evaluated using the CARLA simulator closed-loop benchmark [14] and the NAVSIM non-reactive simulation benchmark [12]. |
| Dataset Splits | Yes | We adopt CARLA Longest6, CARLA Town05 Long, and CARLA Town05 Short as evaluation benchmarks [9, 38], using the official Driving Score (DS), Route Completion (RC), and Infraction Score (IS) as metrics. Detailed implementation details and baseline descriptions are provided in Appendix B.1. This study builds a model training framework based on NAVSIM's navtrain dataset. Unlike the CARLA setup, we adopt VoVNetV2-99 [30] as the feature extraction backbone network in NAVSIM. The Predictive Driver Model Score (PDMS) is used as a comprehensive metric, combining key driving dimensions via weighted integration: No at-fault Collision (NC), Drivable Area Compliance (DAC), Time-To-Collision (TTC), Comfort (C), and Ego Progress (EP). Detailed implementation details and baseline descriptions can be found in Appendix B.2. |
| Hardware Specification | Yes | All experiments are conducted on four NVIDIA 3090 GPUs. |
| Software Dependencies | No | The paper mentions software components like 'RegNetY-3.2GF' and 'VoVNetV2-99' as encoders, which are models typically implemented using deep learning frameworks (e.g., PyTorch, TensorFlow), but it does not specify explicit version numbers for these frameworks or any other software libraries used for implementation. |
| Experiment Setup | Yes | The entire training is divided into two stages, each trained for 30 epochs, with an initial learning rate of 3e-4. Batch size is adapted for different stages (16 for the first stage and 256 for the second stage) to accelerate the convergence of the diffusion model. The specific hyperparameter settings are shown in Table 5. |
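
The two-stage schedule quoted in the Experiment Setup row can be written down as a small configuration sketch. This is only an illustration of the reported numbers, not the authors' code: the class name `StageConfig`, the stage labels, and the helper functions are hypothetical, and it assumes the initial learning rate of 3e-4 applies to both stages, which the quoted text does not state explicitly. All other hyperparameters are deferred to Table 5 of the paper and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """One training stage; field names are illustrative, not from the paper."""
    name: str
    epochs: int
    batch_size: int
    initial_lr: float


# Values quoted in the Experiment Setup row. The stage names are placeholders,
# and using 3e-4 for both stages is an assumption.
STAGES = (
    StageConfig(name="stage_1", epochs=30, batch_size=16, initial_lr=3e-4),
    StageConfig(name="stage_2", epochs=30, batch_size=256, initial_lr=3e-4),
)


def total_epochs(stages):
    """Sum the epochs over all stages."""
    return sum(stage.epochs for stage in stages)


def steps_per_epoch(num_samples, batch_size):
    """Ceiling division: a final partial batch still counts as one step."""
    return -(-num_samples // batch_size)


if __name__ == "__main__":
    for stage in STAGES:
        print(stage)
    print("total epochs:", total_epochs(STAGES))
```

With a hypothetical dataset of 1,000 samples, `steps_per_epoch` shows how much the larger second-stage batch reduces the number of optimizer steps per epoch; the actual dataset sizes are not given in the table above.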