Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling

Authors: Tianyi Tan, Yinan Zheng, Ruiming Liang, Zexu Wang, Kexin ZHENG, Jinliang Zheng, Jianxiong Li, Xianyuan Zhan, Jingjing Liu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on the large-scale nuPlan dataset and challenging interactive interPlan dataset demonstrate that Flow Planner achieves state-of-the-art performance among learning-based approaches while effectively modeling interactive behaviors in complex driving scenarios.
Researcher Affiliation	Academia	(1) Institute for AI Industry Research (AIR), Tsinghua University; (2) Institute of Automation, Chinese Academy of Sciences; (3) The Chinese University of Hong Kong
Pseudocode	No	The paper describes the methodology using textual explanations and a high-level architectural diagram (Figure 1), but it does not include any explicit pseudocode or algorithm blocks with structured steps.
Open Source Code	Yes	Official implementation can be found at https://github.com/DiffusionAD/Flow-Planner.
Open Datasets	Yes	Experimental results on the large-scale nuPlan dataset and challenging interactive interPlan dataset demonstrate that Flow Planner achieves state-of-the-art performance among learning-based approaches while effectively modeling interactive behaviors in complex driving scenarios. In this study, our model is trained on nuPlan [19] featuring a large-scale real-world driving dataset... In addition, we evaluate our model using the full-scale interPlan benchmark [20].
Dataset Splits	Yes	The model is trained on the 1M training split, following [57]. For evaluation, the model is tested on both nuPlan and interPlan, and we mainly focus on the closed-loop performance of the planners, where an LQR controller is used for simulation. We test the model in both non-reactive and reactive settings using the following three benchmarks in nuPlan: (1) Val14 [13], a validation dataset with 1118 scenarios in total; (2) Test14-random [9]: over 200 randomly selected scenarios from the scenario types assigned by the nuPlan Planning Challenge; and (3) Test14-hard [9]: a collection of the worst-performing scenarios by rule-based PDM [13], comprising 272 scenarios.
Hardware Specification	Yes	The training is conducted on 8 NVIDIA A6000 GPUs, using the 1M training data split from nuPlan.
Software Dependencies	No	The paper mentions using 'AdamW optimizer' and 'second-order midpoint method' for the flow ODE solver, and 'LQR controller' for simulation. However, it does not provide specific version numbers for any software libraries, programming languages, or environments.
Experiment Setup	Yes	The training is conducted on 8 NVIDIA A6000 GPUs... The model is trained for over 200 epochs with a batch size of 2048. We used AdamW optimizer for training, and the learning rate is set to be 5 × 10⁻⁴. In addition, we used exponential moving average (EMA) to stabilize the training process, with 0.999 weight decay. During inference, we used a simple midpoint solver to solve the flow ODE, with only four steps of ODE simulation. Table 7 details hyperparameters such as number of neighboring vehicles (32), number of past timestamps (21), dimension of neighboring vehicles (11), number of lanes (70), number of points per polyline (20), dimension of lanes vehicles (12), number of navigation lanes (25), number of encoder blocks (3), number of decoder blocks (4), dimension of encoder hidden layer (192), dimension of decoder hidden layer (256), number of multi-head (8), length of trajectory segment (20), and length of trajectory overlap (10).
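
Illustrative sketch: the Experiment Setup entry mentions a midpoint solver run for four ODE steps and an EMA with a 0.999 factor. The Python below is a minimal, dependency-free sketch of those two ideas and is not taken from the official implementation. The names `midpoint_solve` and `ema_update` are hypothetical, the velocity field is a toy stand-in for a learned network, integration over t in [0, 1] is an assumption, and reading the quoted 0.999 as the EMA decay rate is also an assumption.

```python
import math
from typing import Callable, List

Vector = List[float]


def midpoint_solve(
    velocity: Callable[[Vector, float], Vector],
    x0: Vector,
    num_steps: int = 4,
) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with the explicit midpoint rule."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        # Slope at the start of the step.
        k1 = velocity(x, t)
        # Half Euler step to reach the midpoint state.
        x_mid = [xi + 0.5 * dt * ki for xi, ki in zip(x, k1)]
        # Slope at the midpoint drives the full step (second-order accurate).
        k2 = velocity(x_mid, t + 0.5 * dt)
        x = [xi + dt * ki for xi, ki in zip(x, k2)]
    return x


def ema_update(ema: Vector, params: Vector, decay: float = 0.999) -> Vector:
    """Return the new EMA weights: decay * ema + (1 - decay) * params."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema, params)]


if __name__ == "__main__":
    # Toy field dx/dt = x, whose exact solution at t=1 is e * x0.
    approx = midpoint_solve(lambda x, t: list(x), [1.0], num_steps=4)
    print("midpoint estimate:", approx[0], "exact:", math.e)
    # One EMA update of a single hypothetical weight.
    print("ema after one update:", ema_update([0.0], [1.0]))
```

Each midpoint step costs two velocity evaluations, so four steps correspond to eight evaluations of the velocity function in this sketch.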