3D Reconstruction with Generalizable Neural Fields using Scene Priors

Authors: Yang Fu, Shalini De Mello, Xueting Li, Amey Kulkarni, Jan Kautz, Xiaolong Wang, Sifei Liu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We mainly perform experiments on ScanNet V2 (Dai et al., 2017a) for both surface reconstruction and novel view synthesis. Specifically, we first train the generalizable neural scene prior on the ScanNet V2 training set and then evaluate its performance on the two testing splits proposed by Guo et al. (2022) and Wei et al. (2021) for surface reconstruction and novel view synthesis, respectively. We further perform ablation studies to evaluate the effectiveness and efficiency of the neural prior network.
Researcher Affiliation | Collaboration | Yang Fu¹, Shalini De Mello², Xueting Li², Amey Kulkarni², Jan Kautz², Xiaolong Wang¹, Sifei Liu²; ¹University of California, San Diego; ²NVIDIA
Pseudocode | Yes | Algorithm 1: Prior-guided voxel pruning.
Input: grid features {f_i}_{i=1:N}; grid positions {x_i}_{i=1:N}; positional encoding γ(·); geometry decoder s(·); number of grids N; number of iterations T.
Output: pruned grid features {f_j}_{j=1:M}.
Initialization: τ_0 = 0.16
for t = 1 : T do
    τ = max(0.005, 0.8^(20t/T) · τ_0)
    for i = 1 : N do
        s_i ← s(f_i, γ(x_i))
        if |s_i| > τ then prune the i-th grid
    end
end
Open Source Code | No | All experiments in this paper are reproducible. We are committed to releasing the source code once accepted.
Open Datasets | Yes | We mainly perform experiments on ScanNet V2 (Dai et al., 2017a) for both surface reconstruction and novel view synthesis. To further validate our method, we also conduct experiments on the 10 synthetic scenes proposed by Azinović et al. (2022). Most of the experiments are conducted on the ScanNet dataset and the 10 synthetic scenes collected by Dai et al. (2017a) and Azinović et al. (2022), which are released on their official websites and publicly available to everyone for non-commercial use.
Dataset Splits | No | The paper mentions training and testing splits for the datasets but does not explicitly specify a distinct validation set or its characteristics for reproducibility.
Hardware Specification | Yes | The geometric and texture prior networks are trained on 8 NVIDIA V100 GPUs for 2 days until convergence. The per-scene optimization step is trained and tested on a single NVIDIA V100 GPU.
Software Dependencies | No | Our code is built upon PyTorch (Paszke et al., 2019), and we leverage the released code from nerfstudio (Tancik et al., 2023) under the Apache License. No specific version numbers for PyTorch or nerfstudio are provided.
Experiment Setup | Yes | For each ray, we define a small truncation region near the ground-truth depth, where 32 points are sampled uniformly. We use two MLPs to map the geometry features to SDF values; the hyperparameters λdepth, λsdf and λeik are set to 1.0, 1.0 and 0.5, respectively. For each point, we utilize 2 MLPs in the texture decoder to estimate its RGB value; the hyperparameters λdepth, λsdf, λeik and λrgb are set to 1.0, 1.0, 0.5 and 10.0, respectively.
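The prior-guided pruning loop of Algorithm 1 above can be sketched in framework-free Python. The list-based containers and the decoder/encoding call signatures here are illustrative assumptions, not the authors' implementation:

```python
def prune_grids(feats, coords, sdf_decoder, pos_enc, T,
                tau0=0.16, tau_min=0.005, decay=0.8):
    """Prior-guided voxel pruning (sketch of Algorithm 1): on each of T
    iterations, drop every grid whose predicted |SDF| exceeds a threshold
    that decays geometrically from tau0 toward the floor tau_min."""
    for t in range(1, T + 1):
        # Threshold schedule: tau = max(tau_min, decay^(20t/T) * tau0).
        tau = max(tau_min, decay ** (20 * t / T) * tau0)
        # Keep only grids whose predicted SDF magnitude is within tau,
        # i.e. grids close to the surface.
        kept = [(f, x) for f, x in zip(feats, coords)
                if abs(sdf_decoder(f, pos_enc(x))) <= tau]
        feats = [f for f, _ in kept]
        coords = [x for _, x in kept]
    return feats, coords
```

With the stated constants, the threshold at the final iteration bottoms out at tau_min = 0.005, so only grids whose predicted SDF magnitude stays within that band survive all T rounds.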