MVSplat360: Feed-Forward 360° Scene Synthesis from Sparse Views

Authors: Yuedong Chen, Chuanxia Zheng, Haofei Xu, Bohan Zhuang, Andrea Vedaldi, Tat-Jen Cham, Jianfei Cai

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To evaluate MVSplat360's performance, we introduce a new benchmark using the challenging DL3DV-10K dataset, where MVSplat360 achieves superior visual quality compared to state-of-the-art methods on wide-sweeping or even 360° NVS tasks. Experiments on the existing benchmark RealEstate10K also confirm the effectiveness of our model."
Researcher Affiliation | Academia | "1 Monash University; 2 VGG, University of Oxford; 3 ETH Zurich; 4 University of Tübingen, Tübingen AI Center; 5 Nanyang Technological University"
Pseudocode | No | The paper describes the methodology in text and block diagrams but does not include formal pseudocode or algorithm blocks.
Open Source Code | Yes | "More implementation details can be found in Appendix B, and the codes are publicly available at https://github.com/donydchen/mvsplat360."
Open Datasets | Yes | "To verify the effectiveness of MVSplat360 in synthesizing wide-sweeping and 360° novel views, we have established a challenging benchmark derived from DL3DV-10K [23]. ... We also assess our model on RealEstate10K [74], which contains real estate videos downloaded from YouTube."
Dataset Splits | Yes | "For training, we use a subset in subfolders 3K and 4K, resulting in ~2,000 scenes. We tested on the 140 benchmark scenes and filtered them out from the training set to ensure correctness. For each scene, we selected 5 input views using farthest point sampling based on camera locations and evaluated 56 views by equally sampling from the remaining, yielding a total of 7,840 test views." (a view-selection sketch follows the table)
Hardware Specification | Yes | "All models are trained for 100K steps with an effective batch size of 8 on 1 to 8 A100 GPUs, and we apply the gradient accumulation technique whenever needed." (a gradient-accumulation sketch follows the table)
Software Dependencies | No | "MVSplat360 is implemented with PyTorch and a CUDA-implemented 3DGS renderer." While PyTorch and CUDA are mentioned, specific version numbers are not provided, preventing full reproducibility of the software environment. (an environment-logging sketch follows the table)
Experiment Setup | Yes | "Our default model is trained with the Adam optimizer, and the learning rate is set to 1e-5 and decayed with the one-cycle strategy. All models are trained for 100,000 steps with an effective batch size of 8 on 1 to 8 A100 GPUs, and we apply the gradient accumulation technique whenever needed." (an optimizer and schedule sketch follows the table)
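
The Dataset Splits row describes the evaluation protocol: 5 input views chosen by farthest point sampling over camera locations, plus 56 target views equally sampled from the remainder. Below is a minimal sketch of that selection logic, assuming camera centers are available as an (N, 3) NumPy array; the function names are illustrative and not taken from the released code.

```python
import numpy as np

def farthest_point_sample(cams: np.ndarray, k: int) -> list[int]:
    """Greedily pick k camera indices that are mutually far apart (cams: (N, 3))."""
    chosen = [0]  # arbitrary starting camera
    dist = np.linalg.norm(cams - cams[0], axis=1)  # distance to the chosen set
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))                 # farthest from all chosen so far
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(cams - cams[nxt], axis=1))
    return chosen

def split_views(cams: np.ndarray, n_input: int = 5, n_target: int = 56):
    """5 input views via FPS, 56 targets equally spaced among the remainder."""
    inputs = farthest_point_sample(cams, n_input)
    rest = [i for i in range(len(cams)) if i not in inputs]
    picks = np.linspace(0, len(rest) - 1, n_target).round().astype(int)
    targets = [rest[i] for i in picks]
    return inputs, targets
```

With this protocol, the 140 benchmark scenes yield 140 × 56 = 7,840 evaluated views, matching the count quoted above.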
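
The Hardware Specification row notes an effective batch size of 8 held fixed across 1 to 8 A100 GPUs via gradient accumulation. A minimal PyTorch sketch of that bookkeeping is below; `loss_fn` and the batch format are placeholders, and the exact accumulation rule used by the authors is an assumption.

```python
import itertools
import torch

def train(model, loader, loss_fn, optimizer, num_gpus: int,
          steps: int = 100_000, effective_batch: int = 8):
    """Accumulate gradients over several micro-batches so every optimizer
    step sees an effective batch of 8, regardless of GPU count (assumed rule)."""
    accum = max(1, effective_batch // num_gpus)  # micro-batches per update
    batches = itertools.cycle(loader)            # loop over the data indefinitely
    model.train()
    for _ in range(steps):
        optimizer.zero_grad()
        for _ in range(accum):
            loss = loss_fn(model(next(batches)))
            (loss / accum).backward()            # scale so grads match a full batch
        optimizer.step()
```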
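
Since the Software Dependencies row flags missing version pins, a reproduction has to record the environment itself. The snippet below shows one way to capture the PyTorch and CUDA versions in use; these are standard PyTorch attributes, not part of the MVSplat360 repository.

```python
import torch

# Record the framework versions actually used, since the paper does not pin them.
print("PyTorch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
```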
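
Finally, the Experiment Setup row specifies Adam with a learning rate of 1e-5 decayed by the one-cycle strategy over 100,000 steps. A minimal sketch using PyTorch's built-in `OneCycleLR` follows; the toy model and loss are placeholders, and treating 1e-5 as the cycle's peak rate (`max_lr`) is an assumption, since the paper states only the single value.

```python
import torch

model = torch.nn.Linear(4, 4)  # toy stand-in for the MVSplat360 network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-5, total_steps=100_000
)

for step in range(100_000):
    loss = model(torch.randn(8, 4)).pow(2).mean()  # placeholder objective
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()  # one-cycle: warm up to max_lr, then anneal toward zero
```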