3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors

Authors: Xi Liu, Chaoyi Zhou, Siyu Huang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on large-scale datasets of unbounded scenes demonstrate that 3DGS-Enhancer yields superior reconstruction performance and high-fidelity rendering results compared to state-of-the-art methods. The project webpage is https://xiliu8006.github.io/3DGS-Enhancer-project.
Researcher Affiliation | Academia | Xi Liu*, Chaoyi Zhou*, Siyu Huang, Visual Computing Division, School of Computing, Clemson University, {xi9, chaoyiz, siyuh}@clemson.edu
Pseudocode | No | The paper describes methods and processes but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | The code and the generated dataset will be publicly available. The project webpage is https://xiliu8006.github.io/3DGS-Enhancer-project.
Open Datasets | Yes | In experiments, we generate large-scale datasets with pairs of low-quality and high-quality images on hundreds of unbounded scenes, based on DL3DV [21].
Dataset Splits | No | The paper specifies training and test sets but does not explicitly mention or detail a separate validation split for the experiments.
Hardware Specification | Yes | The training is conducted on 2 NVIDIA A100-80G GPUs over 3 days.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as programming languages (e.g., Python) or libraries (e.g., PyTorch, CUDA).
Experiment Setup | Yes | Our video diffusion model is fine-tuned with a learning rate of 0.0001, incorporating 500 warm-up steps followed by a total of 80,000 training steps. The batch size is set to 1 per GPU, where each batch consists of 25 images at 512x512 resolution. The Adam optimizer is employed to optimize the training process. Additionally, a dropout rate of 0.1 is applied to the conditions between the first and last frames, and the training process uses classifier-free guidance (CFG) to train the diffusion model. The STD is fine-tuned with a learning rate of 0.0005 for 50,000 training steps.
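The hyperparameters quoted in the Experiment Setup row can be collected into a small fine-tuning configuration. The sketch below is illustrative only, assuming a PyTorch setup: a stand-in nn.Conv2d module replaces the actual video diffusion model, the loss is a placeholder rather than the diffusion objective, and the linear warm-up schedule and condition-dropout sampling are one plausible reading of the quoted description. Only the numeric values (learning rates, step counts, frames per batch, resolution, and dropout rate) come from the paper.

```python
import random
import torch
from torch import nn

# Values quoted in the paper's experiment setup.
DIFFUSION_LR = 1e-4       # video diffusion model learning rate
WARMUP_STEPS = 500        # warm-up steps
TOTAL_STEPS = 80_000      # total diffusion fine-tuning steps
FRAMES_PER_BATCH = 25     # images per batch (batch size 1 per GPU)
RESOLUTION = 512          # 512x512 input resolution
COND_DROPOUT = 0.1        # dropout on first/last-frame conditions (for CFG)
STD_LR = 5e-4             # STD fine-tuning learning rate
STD_STEPS = 50_000        # STD fine-tuning steps

# Stand-in module; the actual network is the paper's fine-tuned video diffusion model.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)

optimizer = torch.optim.Adam(model.parameters(), lr=DIFFUSION_LR)

# Linear warm-up over the first 500 steps, then a constant learning rate.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / WARMUP_STEPS)
)

for step in range(2):  # illustrative; the paper trains for TOTAL_STEPS
    # Dummy batch of 25 frames at 512x512; real data comes from the DL3DV-based pairs.
    frames = torch.randn(FRAMES_PER_BATCH, 3, RESOLUTION, RESOLUTION)

    # Classifier-free guidance training: with probability 0.1 the first/last-frame
    # conditions would be dropped so the model also learns the unconditional case.
    keep_condition = random.random() >= COND_DROPOUT

    loss = model(frames).pow(2).mean()  # placeholder loss, not the diffusion objective
    loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```

The STD fine-tuning stage (quoted at a 0.0005 learning rate for 50,000 steps) would presumably follow the same pattern, substituting STD_LR and STD_STEPS.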