SweetDreamer: Aligning Geometric Priors in 2D Diffusion for Consistent Text-to-3D
Authors: Weiyu Li, Rui Chen, Xuelin Chen, Ping Tan
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present the qualitative and quantitative evaluation of the text-to-3D pipelines as described in Section 3.2, as well as comparison results against other text-to-3D baseline methods. |
| Researcher Affiliation | Collaboration | (1) Hong Kong University of Science and Technology, (2) Light Illusions, (3) South China University of Technology, (4) Tencent AI Lab |
| Pseudocode | No | The paper describes the method in prose and provides diagrams, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code. |
| Open Source Code | Yes | We implement it in threestudio (Guo et al., 2023), which implements a diverse set of state-of-the-art text-to-3D generation pipelines. |
| Open Datasets | Yes | We use a public 3D dataset Objaverse (Deitke et al., 2023), which contains around 800k models created by artists, to generate the data for fine-tuning. |
| Dataset Splits | No | The paper mentions using Objaverse for fine-tuning and then evaluating on 80 randomly selected text prompts. However, it does not specify explicit training, validation, or test dataset splits (e.g., percentages or counts) for the Objaverse dataset itself as used in their experiments. |
| Hardware Specification | Yes | The entire fine-tuning process takes approximately 2 days using 8 V100 GPUs for 100k steps. |
| Software Dependencies | Yes | By default, we conduct experiments based on the Stable Diffusion model (we use v2.1). |
| Experiment Setup | Yes | We use the default parameters as in Diffusers, including setting the learning rate to 1e-5 with the constant scheduler, and a batch size of 96 per GPU with 4 gradient accumulation steps. |
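For reference, the reported setup maps onto a standard Diffusers/Accelerate fine-tuning loop roughly as follows. This is a minimal sketch, not the authors' released code: `train_dataset` and `compute_loss` are hypothetical placeholders for the Objaverse-derived data and the paper's fine-tuning objective; only the hyperparameters (Stable Diffusion v2.1, learning rate 1e-5 with a constant scheduler, batch size 96 per GPU, 4 gradient accumulation steps, 100k steps) come from the paper.

```python
import torch
from accelerate import Accelerator
from diffusers import StableDiffusionPipeline
from diffusers.optimization import get_constant_schedule

# 4 gradient accumulation steps, as reported in the paper.
accelerator = Accelerator(gradient_accumulation_steps=4)

# Stable Diffusion v2.1 base model; fine-tuning typically targets the UNet.
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
unet = pipe.unet

# Learning rate 1e-5 with a constant scheduler, per the reported defaults.
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)
lr_scheduler = get_constant_schedule(optimizer)

# Batch size 96 per GPU, as reported. `train_dataset` is a placeholder
# for the data rendered from Objaverse.
train_dataloader = torch.utils.data.DataLoader(
    train_dataset, batch_size=96, shuffle=True
)

unet, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
    unet, optimizer, train_dataloader, lr_scheduler
)

max_steps = 100_000  # ~2 days on 8 V100 GPUs, per the paper
step = 0
while step < max_steps:
    for batch in train_dataloader:
        with accelerator.accumulate(unet):
            # `compute_loss` is a placeholder for the paper's
            # geometric-prior fine-tuning objective.
            loss = compute_loss(unet, batch)
            accelerator.backward(loss)
            optimizer.step()
            lr_scheduler.step()
            optimizer.zero_grad()
        step += 1
        if step >= max_steps:
            break
```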