MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model

Authors: Minghua Liu, Chong Zeng, Xinyue Wei, Ruoxi Shi, Linghao Chen, Chao Xu, Mengqi Zhang, Zhaoning Wang, Xiaoshuai Zhang, Isabella Liu, Hongzhi Wu, Hao Su

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the methods on two datasets: GSO [11] and OmniObject3D [83]. Both datasets contain real-scanned 3D objects that were not seen during training. For the GSO dataset, we use all 1,030 3D shapes for evaluation. ... In Fig. 3, we showcase qualitative examples. Our MeshFormer produces the most accurate meshes with fine-grained, sharp geometric details. ... In Tab. 1, although our baselines include four methods released just one or two months before the time of submission, our MeshFormer significantly outperforms many of them and achieves the best performance on most metrics across two datasets.
Researcher Affiliation | Collaboration | Minghua Liu (1,2), Chong Zeng (3), Xinyue Wei (1,2), Ruoxi Shi (1,2), Linghao Chen (2,3), Chao Xu (2,4), Mengqi Zhang (2), Zhaoning Wang (5), Xiaoshuai Zhang (1,2), Isabella Liu (1), Hongzhi Wu (3), Hao Su (1,2); (1) UC San Diego, (2) Hillbot Inc., (3) Zhejiang University, (4) UCLA, (5) University of Central Florida
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The code and model release have not yet passed our internal inspection, and the model also needs a safety evaluation.
Open Datasets | Yes | We trained MeshFormer on the Objaverse [9] dataset. We evaluate the methods on two datasets: GSO [11] and OmniObject3D [83].
Dataset Splits | No | We trained MeshFormer on the Objaverse [9] dataset. The total number of network parameters is approximately 648 million. We trained the model using 8 H100 GPUs for about one week (350k iterations) with a batch size of 1 per GPU, although we also show that the model can achieve similar results in just two days. ... We evaluate the methods on two datasets: GSO [11] and OmniObject3D [83]. Both datasets contain real-scanned 3D objects that were not seen during training. For the GSO dataset, we use all 1,030 3D shapes for evaluation. For the OmniObject3D dataset, we randomly sample up to 5 shapes from each category, resulting in 1,038 shapes for evaluation.
Hardware Specification | Yes | We trained the model using 8 H100 GPUs for about one week (350k iterations) with a batch size of 1 per GPU, although we also show that the model can achieve similar results in just two days. ... Our main model is trained using 8 H100 GPUs for one week. All experiments listed in the paper can be completed in 15 days using 32 H100 GPUs (running multiple parallel experiments), excluding the preliminary exploration experiments.
Software Dependencies | No | The paper mentions software components such as a 'CUDA-based program', 'BlenderProc', 'NVDiffRast [25]', and 'Zero123++ v1.2 [59]'. While Zero123++ has a version, it is a specific model rather than a general software dependency. Crucially, the paper does not specify version numbers for general programming languages, deep learning frameworks (e.g., PyTorch, TensorFlow), or other key libraries required to replicate the environment and experiments.
Experiment Setup | Yes | We trained the model using 8 H100 GPUs for about one week (350k iterations) with a batch size of 1 per GPU, although we also show that the model can achieve similar results in just two days. ... The model is trained with the Adam optimizer and a cosine learning rate scheduler. The loss weights λ1, ..., λ6 are set to 80, 2, 16, 2, 8, and 8, respectively.