4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization

Authors: Mijeong Kim, Jongwoo Lim, Bohyung Han

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section compares the proposed method, referred to as UA-4DGS, with existing 4D Gaussian splatting algorithms including D-3DGS [68], Zhan et al. [31], and 4DGS [61]. Our method is implemented based on the official code of 4DGS [61] and tested on a single RTX A5000 GPU. Table 1 presents the quantitative comparison of our algorithm against existing methods based on 4D Gaussian Splatting [68, 61, 31] and MLPs [39] on the DyCheck dataset [14].
Researcher Affiliation | Academia | Mijeong Kim¹ (mijeong.kim@snu.ac.kr), Jongwoo Lim²,³ (jongwoo.lim@snu.ac.kr), Bohyung Han¹,³ (bhhan@snu.ac.kr); ¹ECE, ²ME, and ³IPAI, Seoul National University, South Korea.
Pseudocode | No | The paper describes methods and processes but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a pseudocode-like format.
Open Source Code | No | Our method is implemented based on the publicly available official code¹ of 4D Gaussian Splatting (4DGS) [61], using PyTorch [40]. For experiments on the LLFF dataset, our method is implemented based on the publicly available official code³ of FSGS [72]. Footnotes 1 and 3 link to other projects' GitHub repositories, not a distinct repository for the authors' specific contributions.
Open Datasets | Yes | Our primary goal is to reconstruct dynamic scenes from casually recorded monocular videos, for which we use DyCheck [14] as our main dataset. This dataset consists of monocular videos captured with a single handheld camera, featuring scenes with fast motion to provide a challenging and realistic scenario for dynamic scene reconstruction. The DyCheck dataset includes 14 videos; however, only half of the scenes (apple, block, paper-windmill, teddy, space-out, spin, and wheel) are suitable for evaluation due to the availability of held-out views (footnote 2: https://github.com/KAIR-BAIR/dycheck). The paper further states: We test both FSGS and UA-3DGS on the LLFF dataset [33] using three training images with five different runs.
Dataset Splits | No | The DyCheck dataset includes 14 videos; however, only half of the scenes (apple, block, paper-windmill, teddy, space-out, spin, and wheel) are suitable for evaluation due to the availability of held-out views. The paper mentions 'training frames' and 'unseen views' for evaluation, but no explicit percentages or counts for training/validation/test splits are provided for any dataset.
Hardware Specification | Yes | Our method is implemented based on the official code of 4DGS [61] and tested on a single RTX A5000 GPU.
Software Dependencies | No | Our method is implemented based on the publicly available official code¹ of 4D Gaussian Splatting (4DGS) [61], using PyTorch [40]. The reference [40] for PyTorch does not specify a version number.
Experiment Setup | Yes | We train our model for 40,000 iterations, where uncertainty-aware regularization is applied starting from iteration 20,000, as the refined images from the diffusion model and the uncertainty maps become more reliable at this stage. The coefficients (c0, c1) of the sigmoid function are set to 0.25 and 20/L, respectively, where L is the number of training images. We set the balance weights λ_data, λ_UA-diff, and λ_UA-TV to 0.5, 0.2, and 0.01, respectively. For experiments on the LLFF dataset, our method is implemented based on the publicly available official code³ of FSGS [72]; there we set the balance weights λ_UA-diff and λ_UA-TV to 0.1 and 0.001, respectively, for optimal performance, applying the same hyperparameters across all scenes. All experiments are conducted in the Vessl environment [1]. We use the Adam optimizer and set the resolution of the Hexplane grid to (64, 64, 64, 150). For grid smoothness in the Hexplane, we follow the default value of 4DGS.
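
For readers who want to map the quoted hyperparameters onto a concrete configuration, the sketch below collects the reported values into a minimal Python helper. This is an illustrative summary only, assuming a plain dict-based config; the key and function names (make_config, ua_reg_start_iteration, sigmoid_c1, etc.) are hypothetical and do not correspond to the authors' codebase.

# Minimal sketch of the reported training setup; key names are illustrative,
# not taken from the official 4DGS or FSGS code.
def make_config(num_train_images: int, dataset: str = "dycheck") -> dict:
    """Collect the hyperparameters quoted in the Experiment Setup row."""
    config = {
        "total_iterations": 40_000,
        "ua_reg_start_iteration": 20_000,       # uncertainty-aware regularization enabled from here
        "sigmoid_c0": 0.25,                      # c0 of the sigmoid
        "sigmoid_c1": 20.0 / num_train_images,   # c1 = 20 / L, with L training images
        "lambda_data": 0.5,
        "lambda_ua_diff": 0.2,
        "lambda_ua_tv": 0.01,
        "hexplane_resolution": (64, 64, 64, 150),
        "optimizer": "adam",
    }
    if dataset == "llff":
        # For the LLFF/FSGS setting the paper only reports the two regularization
        # weights; the remaining entries above may not apply there.
        config.update({"lambda_ua_diff": 0.1, "lambda_ua_tv": 0.001})
    return config

As quoted above, the uncertainty-aware terms are only switched on after iteration 20,000, so a training loop built on such a config would gate the λ_UA-diff and λ_UA-TV losses on the current iteration count.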