Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
SQS: Enhancing Sparse Perception Models via Query-based Splatting in Autonomous Driving
Authors: Haiming Zhang, Yiyao Zhu, Wending Zhou, Xu Yan, Yingjie CAI, Bingbing Liu, Shuguang Cui, Zhen Li
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on autonomous driving benchmarks demonstrate that SQS delivers considerable performance gains across multiple query-based 3D perception tasks, notably in occupancy prediction and 3D object detection, outperforming prior state-of-the-art pre-training approaches by a significant margin (i.e., +1.3 mIoU on occupancy prediction and +1.0 NDS on 3D detection). We evaluate the effectiveness of SQS on two challenging downstream perception tasks: semantic occupancy prediction and 3D object detection. ... In Tab. 1, we present a comprehensive quantitative comparison... In this section, we conduct ablations on the semantic occupancy prediction task... |
| Researcher Affiliation | Collaboration | Haiming Zhang1,2, Yiyao Zhu3, Wending Zhou1,2, Xu Yan4, Yingjie Cai4, Bingbing Liu4, Shuguang Cui2,1, Zhen Li2,1 1 FNii, Shenzhen 2 SSE, CUHK-Shenzhen 3 HKUST 4 Huawei Noah's Ark Lab {haimingzhang@link.,lizhen@}cuhk.edu.cn EMAIL EMAIL |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided. The methodology is described in the main text of Section 3. |
| Open Source Code | No | Answer: [No] Justification: We plan to release the code upon acceptance. |
| Open Datasets | Yes | We conduct experiments on the nuScenes dataset [3], a large-scale benchmark specifically curated for autonomous driving research. ... Building upon the nuScenes dataset, SurroundOcc [57] provides the dense 3D semantic occupancy annotation tailored for the occupancy prediction task. |
| Dataset Splits | Yes | We conduct experiments on the nuScenes dataset [3], a large-scale benchmark specifically curated for autonomous driving research. The dataset comprises 700 training scenes, 150 validation scenes, and 150 test scenes. |
| Hardware Specification | No | All experiments are conducted on a server with 8 GPUs. |
| Software Dependencies | No | Our implementation is based on MMDetection3D [9]. This mentions a framework but does not provide specific version numbers for key software components or libraries. |
| Experiment Setup | Yes | Implementation Details. During the pre-training stage, we adopt a ResNet101-DCN [11] backbone initialized from an FCOS3D [54] checkpoint for the occupancy prediction task, while ResNet50 and ResNet101 backbones that are pre-trained with nuImages [3] for the 3D object detection task. The feature extraction employs a feature pyramid network [25] (FPN), producing multi-scale image representations at downsampling factors of 4, 8, 16, and 32. We configure the Gaussian counts to 25,600, and apply two transformer layers to enhance Gaussian attributes. Model training utilizes the AdamW [35] optimizer, with a 0.01 weight decay. The learning rate linearly warms up over the initial 500 steps to 2e-4 and then follows a cosine decay schedule. Pre-training is conducted for 20 epochs using a batch size of 8. Only random horizontal flipping data augmentation is included. Our implementation is based on MMDetection3D [9]. Fine-tuning follows the official downstream model configurations without modification. |
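
The learning-rate schedule quoted in the Experiment Setup row (linear warmup over the first 500 steps to 2e-4, then cosine decay) can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `lr_at_step`, the `total_steps` argument, and the decay floor `min_lr=0.0` are assumptions, since the quoted text gives neither the total number of training steps nor the final learning rate.

```python
import math


def lr_at_step(step, total_steps, peak_lr=2e-4, warmup_steps=500, min_lr=0.0):
    """Learning rate at a 0-indexed optimizer step.

    peak_lr and warmup_steps come from the quoted setup; total_steps and
    min_lr are hypothetical parameters introduced for this sketch.
    """
    if step < warmup_steps:
        # Linear warmup: reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Cosine decay from peak_lr down to min_lr.
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 10_000  # hypothetical step count, for illustration only
    for s in (0, 249, 499, 5_250, 10_000):
        print(s, lr_at_step(s, total))
```

In a training loop, the returned value would be written to the optimizer's learning rate before each update; the 0.01 weight decay mentioned in the same row is a separate optimizer setting and is not modeled here.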