Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects
Authors: Mark H. Huang, Lin Geng Foo, Christian Theobalt, Ying Sun, De Wen Soh
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments This section evaluates our approach by outlining the evaluation protocol, describing the datasets for training and testing, comparing against state-of-the-art baselines, and conducting ablation studies to analyze each component's impact. 4.1 Experimental settings 4.2 Results 4.3 Ablations and Analysis |
| Researcher Affiliation | Academia | 1Singapore University of Technology and Design, Singapore 2Max Planck Institute for Informatics, Saarland Informatics Campus, Germany 3Institute for Infocomm Research (I2R) & Centre for Frontier AI Research, A*STAR, Singapore |
| Pseudocode | No | The paper describes its methodology in Section 3 and its subsections (3.1, 3.2, 3.3) using descriptive text and mathematical equations, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Project page: markhh.com/OnlineSplatter. In the NeurIPS Paper Checklist, question 5, the answer states: "Answer: [No] Justification: We provide reference implementation where possible." |
| Open Datasets | Yes | We train on 100K objects sampled from Objaverse [5, 4]. For evaluation, we use two datasets of unseen objects. First, we test on Google Scanned Objects (GSO) [7], rendering 36 frames per object using our training pipeline (each with distinct lighting and motion). Second, we assess generalization to real-world monocular videos with occlusions using the HO3D dataset [10], which contains hand-object interaction sequences. |
| Dataset Splits | Yes | For each test sequence of $N$ frames $\{V_n\}_{n=1}^{N}$, we split the frames into two sets: Input frames ($V_{\text{input}}$): A randomly sampled subset of $N/2$ frames used for input; Target frames ($V_{\text{target}}$): The remaining $N/2$ frames reserved for NVS-based evaluation. |
| Hardware Specification | Yes | We train our model on 8x NVIDIA A100 GPUs with 80GB memory. Our implementation uses Python 3.10, PyTorch 2.1.2, torchvision 0.16.2, and we leverage xFormers [19] 0.0.23 for efficient attention computation. Evaluation Environment: We run all inference on a single L40S GPU with 48GB memory. |
| Software Dependencies | Yes | Our implementation uses Python 3.10, PyTorch 2.1.2, torchvision 0.16.2, and we leverage xFormers [19] 0.0.23 for efficient attention computation. |
| Experiment Setup | Yes | Optimizer: We use AdamW [27] optimizer with learning rate 1e-4, weight decay 0.05, and Cosine Annealing [26] learning rate schedule. The learning rate is warmed up for 2000 steps. Warm-up Training Stage: Steps: Trained for 250K steps with effective batch size 64. Loss Weights: $\lambda_g = 0.3$, $\lambda_{bg} = 0.3$, $\lambda_d = 0.5$. Input Sampling: We sample 3-5 views per sequence sample, with a sampling schedule as described in Sec. D.1. Main Training Stage: Steps: Trained for 500K steps with effective batch size 16, incorporating the memory module. Loss Weights: $\lambda_g = 0.3$, $\lambda_{bg} = 0.3$, $\lambda_d = 0.0$ ($L_{\text{depth}}$ removed). Input Sampling: We sample 6-12 views per sequence sample, with a sampling schedule as described in Sec. D.1. |
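
The Dataset Splits and Experiment Setup rows can be illustrated with a small, dependency-free sketch. It is not the authors' code: the function names `split_frames` and `lr_at_step`, the `seed` parameter, the handling of odd frame counts, the linear shape of the warm-up, and annealing down to zero are all assumptions made for illustration. Only the N/2 random split, the 1e-4 base learning rate, the 2000 warm-up steps, and the use of cosine annealing come from the quoted text.

```python
import math
import random


def split_frames(num_frames, seed=0):
    """Randomly split frame indices into input and target halves.

    Assumption: for odd num_frames the extra frame goes to the target set.
    """
    indices = list(range(num_frames))
    random.Random(seed).shuffle(indices)  # seeded for repeatable splits
    half = num_frames // 2
    input_idx = sorted(indices[:half])    # frames fed to the model
    target_idx = sorted(indices[half:])   # frames held out for NVS evaluation
    return input_idx, target_idx


def lr_at_step(step, total_steps, base_lr=1e-4, warmup_steps=2000):
    """Learning rate with warm-up followed by cosine annealing.

    Assumptions: the warm-up is linear and the cosine decays to zero
    at total_steps; the quoted setup does not specify either detail.
    """
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # A 36-frame sequence, as rendered for each GSO object.
    inputs, targets = split_frames(36)
    print(len(inputs), len(targets))
    # Learning rate at a few points of a 250K-step stage.
    for s in (0, 1999, 125_000, 250_000):
        print(s, lr_at_step(s, total_steps=250_000))
```

Whether the 500K-step main stage restarts the schedule or continues the first one is not stated in the quoted setup, so `total_steps` is left as a caller-supplied parameter.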