Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly

Authors: Junsheng Zhou, Yu-Shen Liu, Zhizhong Han

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We conduct evaluations on various datasets, and report analysis, numerical and visual comparisons with the latest methods to show our superiority." |
| Researcher Affiliation | Academia | Junsheng Zhou (School of Software, Tsinghua University, Beijing, China); Yu-Shen Liu (School of Software, Tsinghua University, Beijing, China); Zhizhong Han (Department of Computer Science, Wayne State University, Detroit, USA) |
| Pseudocode | No | The paper describes its methods through textual explanations and figures (e.g., Figure 2), but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | "We provide our demonstration code as a part of our supplementary materials. We will release the source code, data and instructions upon acceptance." |
| Open Datasets | Yes | The paper evaluates deep prior assembly on four widely used 3D scene reconstruction benchmarks: 3D-FRONT [17], Replica [55], BlendSwap [2], and ScanNet [11]. |
| Dataset Splits | No | The paper mentions using a "test set" for evaluations (e.g., "randomly select 1,000 scene images from the test set"), but does not specify a separate validation split or its proportion. |
| Hardware Specification | Yes | "The total 1,000 iterations take 9.2 seconds on a single 3090 GPU." |
| Software Dependencies | No | The paper names the specific models it builds on (Grounded-SAM [29, 33], Stable-Diffusion [51], Open-CLIP [47, 23], Shap-E [26], Omnidata [13]), but does not give version numbers for software dependencies or libraries (e.g., PyTorch, CUDA). |
| Experiment Setup | Yes | The number M of samples generated by Stable-Diffusion for each instance is set to 6, from which the Top K = 3 samples are selected with Open-CLIP. The pose/scale optimization is repeated r = 10 times per instance with a RANSAC-like solution (see the hedged sketches after the table). |
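
To make the reported selection step concrete, below is a minimal sketch of how the Top-K filtering could be implemented with the open_clip library. The values M = 6 and K = 3 come from the paper; the ranking criterion (image-to-image cosine similarity between the segmented instance crop and each generated sample), the ViT-B-32 checkpoint, and the helper names `embed` and `select_top_k` are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch: rank Stable-Diffusion samples for one instance by
# Open-CLIP image-image similarity and keep the Top-K. M = 6 and K = 3 follow
# the paper's setup; the scoring rule itself is an assumption.
import torch
import open_clip
from PIL import Image

# Checkpoint choice is illustrative; the paper does not specify one.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
model.eval()

@torch.no_grad()
def embed(paths):
    """Encode image files into L2-normalized CLIP image embeddings."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
    feats = model.encode_image(batch)
    return feats / feats.norm(dim=-1, keepdim=True)

def select_top_k(instance_crop, sample_paths, k=3):
    """Keep the k generated samples most similar to the segmented instance."""
    ref = embed([instance_crop])       # (1, D) reference embedding
    cand = embed(sample_paths)         # (M, D) candidate embeddings
    sims = (cand @ ref.T).squeeze(1)   # cosine similarity per candidate
    top = sims.topk(k).indices.tolist()
    return [sample_paths[i] for i in top]

# Usage: 6 Stable-Diffusion samples in, the 3 best-matching samples out.
# best = select_top_k("instance.png", [f"sample_{i}.png" for i in range(6)], k=3)
```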
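Likewise, the "RANSAC-like" pose/scale optimization can be read as a short gradient-based fit repeated from r = 10 random initializations, keeping the best-scoring solution. The sketch below assumes a one-directional nearest-neighbor point distance as the objective and omits rotation for brevity; the authors' exact loss and parameterization are not specified in the quoted text.

```python
# Hypothetical sketch of the RANSAC-like pose/scale search: optimize from
# r = 10 random starts and keep the lowest-loss solution. The objective here
# (mean nearest-neighbor distance) is an assumption, not the paper's loss.
import torch

def fit_pose_scale(obj_pts, scene_pts, r=10, iters=1000, lr=1e-2):
    """obj_pts: (N, 3) points sampled from the generated 3D instance.
    scene_pts: (M, 3) instance points lifted from the estimated depth map."""
    best_loss, best_params = float("inf"), None
    for _ in range(r):  # RANSAC-like restarts with random initialization
        t = (torch.rand(3) - 0.5).requires_grad_(True)  # translation
        log_s = torch.zeros(1, requires_grad=True)      # log-scale keeps scale > 0
        opt = torch.optim.Adam([t, log_s], lr=lr)
        for _ in range(iters):
            opt.zero_grad()
            posed = obj_pts * log_s.exp() + t
            # distance from each posed object point to its nearest scene point
            loss = torch.cdist(posed, scene_pts).min(dim=1).values.mean()
            loss.backward()
            opt.step()
        if loss.item() < best_loss:
            best_loss = loss.item()
            best_params = (t.detach(), log_s.exp().detach())
    return best_params, best_loss
```

The restarts matter because the alignment objective is non-convex: a single initialization can settle in a poor local minimum, while best-of-r selection discards such runs.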