Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving
Authors: Yanping Fu, Xinyuan Liu, Tianyu Li, Yike Ma, Yucheng Zhang, Feng Dai
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the OpenLane-V2 benchmark demonstrate that TopoPoint achieves state-of-the-art performance in topology reasoning (48.8 on OLS). Additionally, we propose DET$_p$ to evaluate endpoint detection, under which our method significantly outperforms existing approaches (52.6 vs. 45.2 on DET$_p$). |
| Researcher Affiliation | Collaboration | ¹Institute of Computing Technology, Chinese Academy of Sciences; ²University of Chinese Academy of Sciences; ³Shanghai AI Lab; ⁴Shanghai Innovation Institute |
| Pseudocode | Yes | Algorithm 1: Point-Lane Geometry Matching Algorithm |
| Open Source Code | Yes | The code is released at https://github.com/Franpin/TopoPoint. |
| Open Datasets | Yes | We evaluate TopoPoint on the large-scale topology reasoning benchmark OpenLane-V2 [17], which is constructed based on Argoverse2 [46] and nuScenes [47]. |
| Dataset Splits | No | OpenLane-V2 is divided into two subsets: subset_A and subset_B, each containing 1,000 scenes captured at 2 Hz with multi-view images and corresponding annotations. Both subsets include annotations for lane centerlines, traffic elements, lane-lane topology, and lane-traffic topology. Notably, subset_A provides seven camera views as input, while subset_B includes six views. |
| Hardware Specification | Yes | All experiments are conducted for 24 epochs on 8 Tesla V100 GPUs with a batch size of 8. |
| Software Dependencies | No | TopoPoint is trained using the AdamW optimizer with a cosine annealing learning rate schedule, starting at $2.0 \times 10^{-4}$ with a weight decay of 0.01. A pretrained ResNet-50 is adopted as the backbone, and a Feature Pyramid Network is used as the neck to extract multi-scale features. |
| Experiment Setup | Yes | The multi-view images have a resolution of $2048 \times 1550$ pixels, with the front view specifically cropped and padded to match $2048 \times 1550$. Notably, all multi-view inputs are downsampled by a factor of 0.5 before being fed into the backbone, except for the front view, which is directly processed at the original resolution. A pretrained ResNet-50 is adopted as the backbone, and a Feature Pyramid Network is used as the neck to extract multi-scale features. The hidden feature dimension $d$ is set to 256. The BEV grid size is configured to $200 \times 100$. The numbers of traffic element queries $N_t$, point queries $N_p$, and lane queries $N_l$ are set to 100, 200, and 300, respectively. The number of sampled points $k$ per lane is set to 11. The decoder consists of 6 layers. Following TopoLogic, the learnable parameters $\lambda$ and $\alpha$ in the mapping function $f_{map}$ are initialized to 0.2 and 2.0, respectively; $\lambda_1$ and $\lambda_2$ in $A_{pl}$ are both initialized to 1.0. The detection loss weights $\lambda_t$, $\lambda_p$, and $\lambda_l$ are all set to 1.0, while the topology reasoning loss weights $\lambda_{ll}$ and $\lambda_{lt}$ are both set to 5.0. In inference, the classification thresholds for filtering high-confidence predictions are both set to $\tau_p = \tau_l = 0.3$. For geometric matching, the distance threshold $\delta$ is set to 1.5 meters to determine valid point-lane associations. TopoPoint is trained using the AdamW optimizer with a cosine annealing learning rate schedule, starting at $2.0 \times 10^{-4}$ with a weight decay of 0.01. All experiments are conducted for 24 epochs on 8 Tesla V100 GPUs with a batch size of 8. |
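
For readers attempting a reproduction, the values quoted in the Experiment Setup row can be collected into a single configuration object. The sketch below is illustrative only: the class name `TopoPointSetup`, all field names, and the helper `cosine_lr` are hypothetical and are not taken from the released code. The numeric values are copied from the row above. The schedule uses the standard cosine annealing formula and assumes per-epoch stepping, a minimum learning rate of 0, and no warmup, none of which is stated in the excerpt.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TopoPointSetup:
    """Hyperparameters quoted in the Experiment Setup row.

    Field names are hypothetical; only the values come from the report.
    """
    image_size: tuple = (2048, 1550)   # input resolution in pixels
    side_view_scale: float = 0.5       # downsampling for non-front views
    hidden_dim: int = 256              # hidden feature dimension d
    bev_grid: tuple = (200, 100)       # BEV grid size
    num_traffic_queries: int = 100     # N_t
    num_point_queries: int = 200       # N_p
    num_lane_queries: int = 300        # N_l
    points_per_lane: int = 11          # k
    decoder_layers: int = 6
    det_loss_weight: float = 1.0       # lambda_t, lambda_p, lambda_l
    topo_loss_weight: float = 5.0      # lambda_ll, lambda_lt
    score_threshold: float = 0.3       # tau_p = tau_l
    match_distance_m: float = 1.5      # delta, in meters
    base_lr: float = 2.0e-4            # initial AdamW learning rate
    weight_decay: float = 0.01
    epochs: int = 24
    num_gpus: int = 8
    batch_size: int = 8                # total vs. per-GPU is not stated


def cosine_lr(epoch: int, cfg: TopoPointSetup, min_lr: float = 0.0) -> float:
    """Standard cosine annealing, stepped once per epoch.

    Assumption: min_lr = 0 and no warmup; the report does not specify either.
    """
    progress = epoch / cfg.epochs
    return min_lr + 0.5 * (cfg.base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    cfg = TopoPointSetup()
    # Print the assumed learning rate at the start, midpoint, and end of training.
    for epoch in (0, cfg.epochs // 2, cfg.epochs):
        print(epoch, cosine_lr(epoch, cfg))
```

Under these assumptions the learning rate starts at the quoted initial value, reaches half of it at the midpoint of training, and decays to zero at the final epoch. An actual reproduction should take the schedule details from the released repository rather than from this sketch.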