DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

Authors: Jingxiang Sun, Bo Zhang, Ruizhi Shao, Lizhen Wang, Wen Liu, Zhenda Xie, Yebin Liu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct a comparative analysis of our technique against five baseline methods. The metrics are measured on 300 generated samples. The effect of 3D prior. ... an ablation study is conducted. The effect of BSD. Figure 4 also presents an ablation study.
Researcher Affiliation | Collaboration | Jingxiang Sun (1), Bo Zhang (3), Ruizhi Shao (1), Lizhen Wang (1), Wen Liu (2), Zhenda Xie (2), Yebin Liu (1). Affiliations: (1) Tsinghua University, (2) DeepSeek AI, (3) Zhejiang University.
Pseudocode | Yes | Algorithm 1: Bootstrapped Score Distillation
Open Source Code | Yes | Code available at https://github.com/deepseek-ai/DreamCraft3D.
Open Datasets | No | We establish a test benchmark that includes 300 images... We intend to make this test benchmark accessible to the public.
Dataset Splits | No | The paper establishes a test benchmark of 300 images and measures metrics on 300 generated samples, but it does not specify training, validation, or test splits or percentages.
Hardware Specification | Yes | We conducted our timing tests using 8 A100 GPUs for training and a single A100 GPU for inference.
Software Dependencies | No | The paper names several software components, models, and frameworks (e.g., Instant NGP, NeuS, DMTet, DeepFloyd IF, Stable Diffusion, Zero-1-to-3, DreamBooth, the threestudio library), but it does not give version numbers for any of them.
Experiment Setup | Yes | We set λrgb = 10000, λmask = 5000, λdepth = λnormal = 0.1, λhybrid = 1. In the geometry sculpting stage, ...optimizing from a 64 to a 384 resolution. For the textured mesh, we use DMTet at a 128 grid and 512 rendering resolution. At the start of optimization, we prioritize sampling larger diffusion timestep t from the range [0.7, 0.85]... linearly anneal the t sampling range to [0.2, 0.5] over hundreds of iterations. We linearly increase the sampling range of camera positions with elevation angle (ϕcam) from 0° to [-10°, 45°], and azimuth angle (θcam) from 0° to [-180°, 180°]. The progress length is set as 200 iterations.
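
The two linear schedules quoted in the Experiment Setup entry (annealing the diffusion timestep range and widening the camera range) can be illustrated with a short Python sketch. This is not the authors' code. The function names, the shared step counter, and uniform sampling inside each range are assumptions made for illustration. The annealing length is described only as "hundreds of iterations", so it is left as a required argument. The final camera ranges are also passed in as arguments; the example call reads them as [-10, 45] and [-180, 180] degrees, which is an interpretation of the setup text.

```python
import random


def lerp(start, end, w):
    """Linear interpolation; returns the endpoints exactly at w=0 and w=1."""
    return start * (1.0 - w) + end * w


def progress(step, length):
    """Fraction of a schedule completed at `step`, clamped to [0, 1]."""
    return min(max(step / length, 0.0), 1.0)


def timestep_range(step, anneal_steps, start=(0.7, 0.85), end=(0.2, 0.5)):
    """Timestep sampling range, annealed linearly from `start` to `end`.

    `anneal_steps` has no default because the source only says
    "hundreds of iterations".
    """
    w = progress(step, anneal_steps)
    return (lerp(start[0], end[0], w), lerp(start[1], end[1], w))


def camera_ranges(step, elev_final, azim_final, progress_length=200):
    """Elevation and azimuth ranges (degrees), widened linearly from 0.

    Both ranges start collapsed at 0 and reach their final extent after
    `progress_length` iterations (200 in the quoted setup).
    """
    w = progress(step, progress_length)
    elev = (lerp(0.0, elev_final[0], w), lerp(0.0, elev_final[1], w))
    azim = (lerp(0.0, azim_final[0], w), lerp(0.0, azim_final[1], w))
    return elev, azim


def sample_step(step, anneal_steps, elev_final, azim_final, rng):
    """Draw one (t, elevation, azimuth) sample for an optimization step.

    Uniform sampling within each range is an assumption of this sketch.
    """
    t_lo, t_hi = timestep_range(step, anneal_steps)
    (e_lo, e_hi), (a_lo, a_hi) = camera_ranges(step, elev_final, azim_final)
    return {
        "t": rng.uniform(t_lo, t_hi),
        "elevation": rng.uniform(e_lo, e_hi),
        "azimuth": rng.uniform(a_lo, a_hi),
    }


if __name__ == "__main__":
    rng = random.Random(0)
    # anneal_steps=500 is an arbitrary illustrative value, not from the paper.
    for step in (0, 100, 200, 500):
        print(step, sample_step(step, 500, (-10.0, 45.0), (-180.0, 180.0), rng))
```

At step 0 every camera sample sits at elevation 0 and azimuth 0 while t is drawn from [0.7, 0.85]. Once both schedules have finished, the camera covers its full range and t is drawn from [0.2, 0.5].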