Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
Authors: Kailu Wu, Fangfu Liu, Zhihan Cai, Runjie Yan, Hanyang Wang, Yating Hu, Yueqi Duan, Kaisheng Ma
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our Unique3D significantly outperforms other image-to-3D baselines in terms of geometric and textural details. |
| Researcher Affiliation | Collaboration | Kailu Wu (Tsinghua University), Fangfu Liu (Tsinghua University), Zhihan Cai (Tsinghua University), Runjie Yan (Tsinghua University), Hanyang Wang (Tsinghua University), Yating Hu (AVAR Inc.), Yueqi Duan (Tsinghua University), Kaisheng Ma (Tsinghua University) |
| Pseudocode | Yes | Algorithm 1: Color Completion Algorithm |
| Open Source Code | Yes | Project page: https://wukailu.github.io/Unique3D/. |
| Open Datasets | Yes | Utilizing a subset of the Objaverse dataset as delineated by LGM [53], we apply a rigorous filtration process to exclude scenes containing multiple objects, low-resolution imagery, and unidirectional faces, leading to a refined dataset of approximately 50k objects. |
| Dataset Splits | No | The paper uses the Objaverse dataset for training and Google Scanned Objects (GSO) for evaluation, but does not specify train/validation/test splits (e.g., percentages or counts) for its own model training. |
| Hardware Specification | Yes | The entire training takes around 4 days on 8 NVIDIA RTX4090 GPUs. |
| Software Dependencies | No | The initial level of image generation is initialized with the weights of the Stable Diffusion Image Variations Model [40], while the subsequent level employs an upscaled version fine-tuned from ControlNet-Tile [70]. The final stage uses the pre-trained Real-ESRGAN model [58]. |
| Experiment Setup | Yes | The reconstruction process involves 300 iterations using the SGD optimizer [3], with a learning rate of 0.3. The weight of expansion regularization is set to 0.1. Subsequent refinement takes 100 iterations, maintaining the same optimization parameters. |
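The optimization schedule quoted in the Experiment Setup row (300 SGD iterations at learning rate 0.3 with an expansion-regularization weight of 0.1, then 100 refinement iterations with the same settings) can be sketched as a plain gradient-descent loop. This is a minimal illustration only: the quadratic data term and L2 penalty below are toy stand-ins, not the paper's actual reconstruction or expansion objectives.

```python
import numpy as np

def sgd_stage(x, target, iters, lr=0.3, reg_weight=0.1):
    """Plain SGD on a toy objective: 0.5*||x - target||^2 + reg_weight*0.5*||x||^2.

    lr, reg_weight, and the two-stage iteration counts below mirror the
    hyperparameters reported in the paper; the loss itself is hypothetical.
    """
    for _ in range(iters):
        # Gradient of the data term plus the weighted regularizer.
        grad = (x - target) + reg_weight * x
        x = x - lr * grad
    return x

x = np.zeros(3)
target = np.array([1.0, -2.0, 0.5])
x = sgd_stage(x, target, iters=300)   # initial reconstruction stage
x = sgd_stage(x, target, iters=100)   # subsequent refinement stage
print(np.round(x, 3))
```

With this toy objective the iterate converges to `target / (1 + reg_weight)`, showing how the regularization weight biases the solution away from a pure data fit.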