Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

3D Gaussian Flats: Hybrid 2D/3D Photometric Scene Reconstruction

Authors: Maria Taktasheva, Lily Goli, Alessandro Fiorini, Zhen Li, Daniel Rebain, Andrea Tagliasacchi

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluations demonstrate that this hybrid representation achieves state-of-the-art depth estimation results on challenging indoor datasets including the new ScanNet++ dataset which was designed for dense reconstruction tasks using NeRF-based approaches, and the legacy ScanNetv2 dataset with sparser camera views. Our method delivers crisp reconstructed surfaces, while maintaining competitive visual quality compared to fully 3D representations. Beyond novel view synthesis, our approach has application in mesh extraction for planar surfaces, producing high-quality meshes and accurate mesh segmentation results across diverse capture setups (DSLR and iPhone captures), without the overfitting issues that negatively affect previous methods trained on specific camera models. Quantitative and qualitative results across both datasets show significant improvement in depth accuracy compared to all baselines. Notably, our method achieves comparable image quality to SOTA 3D representations on dense ScanNet++ scenes while surpassing them in depth quality, evidenced by sharper geometry reconstruction in qualitative examples. In the sparser ScanNetv2 scenes, our approach delivers superior performance in both depth and image quality, leveraging the planar prior of indoor environments to overcome the geometric ambiguity that challenges pure 3D methods in sparse captures. Our method also substantially outperforms 2DGS in both image fidelity and depth accuracy metrics.
Researcher Affiliation | Academia | Maria Taktasheva (Simon Fraser University); Lily Goli (University of Toronto); Alessandro Fiorini (University of Bologna); Zhen Li (Simon Fraser University); Daniel Rebain (University of British Columbia); Andrea Tagliasacchi (Simon Fraser University, University of Toronto)
Pseudocode | No | The paper describes the method's steps in structured prose and uses diagrams (e.g., Figure 2 for Overview Training) but does not include any explicitly labeled pseudocode or algorithm blocks with code-like formatting.
Open Source Code | Yes | Code: We release our code (footnote 3) publicly for reproducibility purposes and to facilitate future research in this area. We base our code on the 3DGS-MCMC paper [13] and additionally use SAMv2 [29] and PlaneRecNet [25] to generate masks. The baselines are evaluated using their official released code [7, 6, 16, 17, 13, 8, 9]. We further utilize AirPlanes [9] code to compute meshing metrics. (Footnote 3: https://github.com/theialab/3dgs-flats)
Open Datasets | Yes | Datasets: We perform evaluations on common indoor scene benchmarks ScanNet++ [31] and ScanNetv2 [32], as they primarily feature indoor scenes with flat textureless surfaces suitable for the task at hand.
Dataset Splits | Yes | For ScanNet++, we use 11 training scenes with ground truth meshes for depth derivation, utilizing iPhone video streams, sampling every 10th frame for training at 2× downsampling and every 8th for testing. We chose the scenes that are diverse in their content and contain various planar surfaces. For ScanNet, we evaluate on 5 scenes with sufficient overlapping views of planar surfaces following the data preparation scheme of [27].
Hardware Specification | Yes | All experiments were conducted on a single A6000 ADA GPU, with 46GB memory.
Software Dependencies | No | The paper mentions using SAMv2 [29] and PlaneRecNet [25] for mask generation and bases its code on 3DGS-MCMC [13], but it does not specify version numbers for these software components or any other key libraries/frameworks.
Experiment Setup | Yes | We begin our optimization with a warm-up stage using only 3D Gaussians (for N=3500 iterations). After that, we begin our planar reconstruction where in each round of optimization we: (i) dynamically initialize plane parameters by robustly fitting planes to the current representation (section 3.2); (ii) alternate between optimizing plane and Gaussian parameters (section 3.2); (iii) densify our representation through a (slightly modified) MCMC densification, due to the challenges of optimizing compact-support functions (section 3.4). We optimize our representation by block-coordinate descent, starting each round of optimization by only optimizing the plane parameters for a fixed number of 10 iterations, and then freezing these, and optimizing the Gaussian parameters (both 2D and 3D) for another 100 iterations. We use σ = 0.01 and σ = 0.3. We observe that setting λ_mask = 0.1 yields best results empirically. For regularizers, we use λ_TV = 0.1, λ_scale = 0.01 and λ_opacity = 0.01 following [10] and [13]. We use the same scheduling policy for learning plane origin and normal (rotation) as for the Gaussian means in vanilla 3DGS.
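The alternating schedule quoted in the Experiment Setup entry can be illustrated with a small sketch. This is not the authors' code: the function name `phase_at`, the phase labels, and the 0-based iteration index are hypothetical, and the sketch assumes rounds run back to back after the warm-up. Plane fitting and MCMC densification are left out; only the iteration counts (3500, 10, 100) come from the quoted text.

```python
# Illustrative only: maps an iteration index to the parameter block that
# the quoted block-coordinate descent schedule would be optimizing.
WARMUP_ITERS = 3500    # from the text: warm-up using only 3D Gaussians
PLANE_ITERS = 10       # from the text: plane-only iterations per round
GAUSSIAN_ITERS = 100   # from the text: Gaussian-only iterations per round


def phase_at(iteration: int) -> str:
    """Return the (hypothetical) phase label for a 0-based iteration."""
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    if iteration < WARMUP_ITERS:
        return "warmup_3d_gaussians"
    # Assumption: rounds follow each other with no gap between them.
    round_length = PLANE_ITERS + GAUSSIAN_ITERS
    offset = (iteration - WARMUP_ITERS) % round_length
    # Planes are optimized first, then frozen while Gaussians are optimized.
    return "planes" if offset < PLANE_ITERS else "gaussians"


if __name__ == "__main__":
    for it in (0, 3499, 3500, 3509, 3510, 3609, 3610):
        print(it, phase_at(it))
```

Under these assumptions each round spans 110 iterations, so the plane parameters are updated in roughly 9% of the post-warm-up steps.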