Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Learning Temporal 3D Semantic Scene Completion via Optical Flow Guidance
Authors: Meng Wang, Fan Wu, Ruihui Li, Qin Yunchuan, Zhuo Tang, Li Ken Li
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that FlowScene achieves state-of-the-art performance, with mIoU of 17.70 and 20.81 on the SemanticKITTI and SSCBench-KITTI-360 benchmarks. |
| Researcher Affiliation | Academia | (1) College of Computer Science and Electronic Engineering, Hunan University; (2) State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University. EMAIL |
| Pseudocode | Yes | Algorithm 1: Inference algorithm of flow-guided temporal alignment. Algorithm 2: Inference algorithm of occlusion-guided voxel refinement. |
| Open Source Code | Yes | Project Page: https://github.com/willemeng/FlowScene |
| Open Datasets | Yes | To assess the effectiveness of our FlowScene, we conducted thorough experiments using the large outdoor datasets SemanticKITTI [1, 5] and SSCBench-KITTI-360 [16, 19]. |
| Dataset Splits | Yes | The SemanticKITTI [1, 5] dataset ... It consists of 10 training sequences, 1 validation sequence, and 11 testing sequences. RGB images are resized to 1280 × 384 for input processing. The SSCBench-KITTI-360 [16, 19] dataset contains 7 training sequences, 1 validation sequence, and 1 testing sequence, covering 19 semantic classes in total. |
| Hardware Specification | Yes | All models are trained on two A100 Nvidia GPUs with 80G memory and batch size 4. |
| Software Dependencies | No | The paper mentions software components such as RepViT [35], FPN [20], GMFlow [41], the LSS view transformation [25], and the AdamW optimizer, but it does not specify their version numbers, which a reproducible description requires. |
| Experiment Setup | Yes | The number of historical temporal frames n is set to 2. We use and freeze the GMFlow [41] optical flow estimation model to obtain optical flow information. We use the LSS paradigm for 2D-3D projection. The neighborhood cross-attention range is set to 7, and the number of attention heads is set to 8. Finally, the final outputs of SemanticKITTI is 20 classes, and SSCBench-KITTI-360 is 19 classes. All datasets have the scene size of 51.2m × 51.2m × 64m with the voxel grid size of 256 × 256 × 32. By default, the model is trained for 25 epochs. We optimise the process, utilizing the AdamW optimizer with an initial learning rate of 1e-4 and a weight decay of 0.01. We also employ a multi-step scheduler to reduce the learning rate. |
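
The Experiment Setup values above can be collected into a small configuration object, which may help when trying to reproduce the reported training run. The following is a minimal sketch in plain Python, not the authors' code: the class name `TrainConfig`, the function `lr_at_epoch`, and in particular the scheduler milestones `(20, 23)` and decay factor `0.1` are hypothetical, since the quoted text states only that a multi-step scheduler is used.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    # Values quoted in the Experiment Setup and Hardware Specification rows.
    history_frames: int = 2      # number of historical temporal frames n
    attention_range: int = 7     # neighborhood cross-attention range
    attention_heads: int = 8
    epochs: int = 25
    batch_size: int = 4
    base_lr: float = 1e-4        # initial AdamW learning rate
    weight_decay: float = 0.01
    # ASSUMPTION: milestones and gamma are not given in the report;
    # these values are placeholders for illustration only.
    lr_milestones: Tuple[int, ...] = (20, 23)
    lr_gamma: float = 0.1


def lr_at_epoch(cfg: TrainConfig, epoch: int) -> float:
    """Multi-step schedule: scale the base rate by gamma once per milestone reached."""
    passed = sum(1 for m in cfg.lr_milestones if epoch >= m)
    return cfg.base_lr * cfg.lr_gamma ** passed


cfg = TrainConfig()
# Learning rate for each of the 25 training epochs (0-indexed).
schedule = [lr_at_epoch(cfg, e) for e in range(cfg.epochs)]
```

Replacing the placeholder milestones with the values from the released code, if they are documented there, would be necessary before relying on this schedule.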