DC-Gaussian: Improving 3D Gaussian Splatting for Reflective Dash Cam Videos

Authors: Linhan Wang, Kai Cheng, Shuo Lei, Shengkun Wang, Wei Yin, Chenyang Lei, Xiaoxiao Long, Chang-Tien Lu

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on self-captured and public dash cam videos show that our method not only achieves state-of-the-art performance in novel view synthesis but also accurately reconstructs captured scenes while removing obstructions. See the project page for code and data: https://linhanwang.github.io/dcgaussian/.
Researcher Affiliation | Academia | Linhan Wang (Virginia Tech), Kai Cheng (USTC), Shuo Lei (Virginia Tech), Shengkun Wang (Virginia Tech), Wei Yin (University of Adelaide), Chenyang Lei (CAIR, HKISI-CAS), Xiaoxiao Long (Hong Kong University), Chang-Tien Lu (Virginia Tech)
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | See the project page for code and data: https://linhanwang.github.io/dcgaussian/.
Open Datasets | Yes | To evaluate the performance of our method and baselines, we adopt BDD100K [59]. The evaluation set contains 8 scenes drawn from dash cam videos captured in daily life; they contain common obstructions such as reflections, mobile phone holders, stickers, and stains. Evaluation on this dataset reflects performance on real-life dash cam videos.
Dataset Splits | No | To evaluate novel view synthesis, following common settings [3], we select every eighth image for testing and use the remaining images for training. (A minimal split sketch follows the table.)
Hardware Specification | Yes | We run all the experiments on an A100 GPU. Each scene contains approximately 300 images in our evaluation datasets. Combined training of the 3DGS and IOM for 30k iterations with the Adam optimizer takes about 30 minutes. It takes 40 minutes in total to evaluate MVS and run geometry filtering. ... DC-Gaussian achieves 120 fps at a resolution of 1920x1080 on an RTX 3090 GPU.
Software Dependencies | No | We develop our method based on 3DGS. We borrow the multi-resolution hash encoding and fast MLP implementation from tiny-cuda-nn [31] to build the IOM. We choose PatchMatchNet [48] as the MVS method in G3E. We use the popular tools COLMAP [37] and HLoc [35, 26] to estimate the camera parameters. (A hedged tiny-cuda-nn sketch follows the table.)
Experiment Setup | Yes | We use 0.001 for both λ1 and λ2. Combined training of the 3DGS and IOM for 30k iterations with the Adam optimizer takes about 30 minutes. ... Here I_j is the j-th input image; I_j^o and I_j^t are synthesized by the trained IOM and 3DGS, respectively. Eq. 7 is derived from Eq. 4. We use 0.5 for the threshold τ in all experiments. (A hedged training-setup sketch follows the table.)
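
The "every eighth image" holdout in the Dataset Splits row is the standard LLFF-style protocol. Here is a minimal sketch of that split, assuming frames are sorted by filename and that the held-out indices start at 0 (the excerpt does not specify the offset):

```python
from pathlib import Path

def split_every_eighth(image_paths):
    # Standard LLFF-style holdout: every eighth image goes to the test set,
    # the remaining images go to the training set. The starting offset
    # (index 0) is an assumption; the excerpt does not specify it.
    image_paths = sorted(image_paths)
    test = image_paths[::8]
    train = [p for i, p in enumerate(image_paths) if i % 8 != 0]
    return train, test

# Example with a hypothetical scene directory of extracted dash cam frames.
train_imgs, test_imgs = split_every_eighth(Path("scenes/scene01/images").glob("*.png"))
```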
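
The Software Dependencies row names tiny-cuda-nn's multi-resolution hash encoding and fast MLP as the building blocks of the IOM, but the excerpt gives none of its hyperparameters. Below is a minimal sketch using tiny-cuda-nn's PyTorch bindings; the input/output dimensions and every config value are placeholder assumptions, not the paper's settings:

```python
import torch
import tinycudann as tcnn  # https://github.com/NVlabs/tiny-cuda-nn

# Multi-resolution hash encoding feeding a fully fused MLP. All values
# below are assumed placeholders; the excerpt does not give the IOM config.
iom = tcnn.NetworkWithInputEncoding(
    n_input_dims=2,   # assumption: 2D image-plane coordinates as input
    n_output_dims=3,  # assumption: an RGB obstruction layer as output
    encoding_config={
        "otype": "HashGrid",
        "n_levels": 16,
        "n_features_per_level": 2,
        "log2_hashmap_size": 19,
        "base_resolution": 16,
        "per_level_scale": 2.0,
    },
    network_config={
        "otype": "FullyFusedMLP",
        "activation": "ReLU",
        "output_activation": "None",
        "n_neurons": 64,
        "n_hidden_layers": 2,
    },
)

coords = torch.rand(4096, 2, device="cuda")  # normalized query coordinates
obstruction_rgb = iom(coords)                # shape (4096, 3)
```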
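
The Experiment Setup row fixes λ1 = λ2 = 0.001, τ = 0.5, and 30k Adam iterations, but Eqs. 4 and 7 themselves are not reproduced in this excerpt. The loop below only illustrates how such weighted terms typically enter a combined 3DGS + IOM objective; render_loss, reg1, reg2, and the learning rate are hypothetical stand-ins for the paper's actual definitions:

```python
import torch

lambda1, lambda2 = 1e-3, 1e-3  # λ1 and λ2 from the excerpt
tau = 0.5                      # threshold τ for the Eq. 7 quantity (not reproduced here)
num_iters = 30_000             # combined 3DGS + IOM training length

# Stand-in parameters; in the paper these would be the 3DGS and IOM weights.
params = [torch.randn(8, requires_grad=True)]
optimizer = torch.optim.Adam(params, lr=1e-3)  # learning rate is an assumption

for step in range(num_iters):
    # Hypothetical loss terms; the excerpt does not define them.
    render_loss = params[0].pow(2).mean()
    reg1 = params[0].abs().mean()
    reg2 = params[0].var()

    loss = render_loss + lambda1 * reg1 + lambda2 * reg2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```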