Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
MVDream: Multi-view Diffusion for 3D Generation
Authors: Yichun Shi, Peng Wang, Jianglong Ye, Long Mai, Kejie Li, Xiao Yang
ICLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on three tasks: (1) multi-view image generation for evaluating image quality and consistency (Sec. 4.1), (2) 3D (NeRF) generation with multi-view score distillation as a main downstream task (Sec. 4.2), and (3) DreamBooth for personalized 3D generation (Sec. 4.3). |
| Researcher Affiliation | Collaboration | 1 ByteDance, USA, 2 University of California, San Diego |
| Pseudocode | Yes | Algorithm 1: Pseudocode for MVDream training |
| Open Source Code | No | "Our project page is https://MV-Dream.github.io" and "Besides, we will release our code as well as model checkpoints publicly after the paper submission." The latter statement indicates a future release, not current availability, and the project page does not explicitly state that it hosts the code for the paper. |
| Open Datasets | Yes | We fine-tune the open-sourced stable diffusion 2.1 model (sta) on the Objaverse dataset (Deitke et al., 2023) and LAION dataset (Schuhmann et al., 2022) for experiments. |
| Dataset Splits | Yes | We randomly choose 1,000 subjects from the held-out validation set and generate 4-view images using the given prompts and camera parameters. |
| Hardware Specification | Yes | The training takes about 3 days on 32 Nvidia Tesla A100 GPUs. ... The SDS process takes about 1.5 hour on a Tesla V100 GPU with shading and 1 hour without shading. |
| Software Dependencies | Yes | We fine-tune our model from the Stable Diffusion v2.1 base model (512 × 512 resolution) (sta) ... For multi-view SDS, we implement our multi-view diffusion guidance in the threestudio (thr) library... |
| Experiment Setup | Yes | We use a reduced image size of 256 × 256 and a total batch size of 1,024 (4,096 images) for training and fine-tune the model for 50,000 steps. ... The 3D model is optimized for 10,000 steps with an AdamW optimizer (Kingma & Ba, 2014) at a learning rate of 0.01. For SDS, the maximum and minimum time steps are decreased from 0.98 to 0.5 and 0.02, respectively, over the first 8,000 steps. |
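
The Experiment Setup row quotes a time-step annealing schedule for SDS and a batch size expressed both in samples and in images. The following is a minimal sketch of how those numbers could fit together, not the authors' implementation. It assumes that the annealing is linear and that both bounds start at 0.98, neither of which the quoted text states outright; the function name `annealed_timestep_range` and the constant names are hypothetical.

```python
# Values quoted in the Experiment Setup row.
BATCH_SIZE = 1024        # total batch size (multi-view samples)
VIEWS_PER_SAMPLE = 4     # 4-view images per sample
ANNEAL_STEPS = 8000      # annealing covers the first 8,000 SDS steps
T_START = 0.98           # assumed common starting value for both bounds
T_MAX_END = 0.5          # final maximum time step
T_MIN_END = 0.02         # final minimum time step


def annealed_timestep_range(step: int) -> tuple[float, float]:
    """Return (t_min, t_max) for a given optimization step.

    Assumption: linear interpolation from T_START to the final values,
    held constant once ANNEAL_STEPS is reached.
    """
    frac = min(max(step / ANNEAL_STEPS, 0.0), 1.0)
    t_min = T_START + (T_MIN_END - T_START) * frac
    t_max = T_START + (T_MAX_END - T_START) * frac
    return t_min, t_max


# 1,024 samples of 4 views each give the 4,096 images quoted in the row.
images_per_batch = BATCH_SIZE * VIEWS_PER_SAMPLE

if __name__ == "__main__":
    print(images_per_batch)
    for s in (0, 4000, 8000, 10000):
        print(s, annealed_timestep_range(s))
```

Under these assumptions the sampling range is a single point at step 0 and widens as optimization proceeds, staying fixed at [0.02, 0.5] for the remaining 2,000 of the 10,000 optimization steps.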