reproducibilityindex.ai

TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields

Authors: Tianyu Huang, Yihan Zeng, Bowen Dong, Hang Xu, Songcen Xu, Rynson W. H. Lau, Wangmeng Zuo

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that our method achieves a potential open-vocabulary 3D generation capability.
Researcher Affiliation	Collaboration	Tianyu Huang (1, 3), Yihan Zeng (2), Bowen Dong (1), Hang Xu (2), Songcen Xu (2), Rynson W. H. Lau (3), Wangmeng Zuo (1). (1) Harbin Institute of Technology, (2) Huawei Noah's Ark Lab, (3) City University of Hong Kong
Pseudocode	No	The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide an explicit statement or link confirming that its source code is open-source or publicly available.
Open Datasets	Yes	We train TextField3D with a large-scale 3D dataset Objaverse (Deitke et al., 2022), which collects over 800k 3D objects from various categories. For ablation studies, a clean 3D dataset ShapeNet (Chang et al., 2015) is used to evaluate the generative quality.
Dataset Splits	Yes	The filtered data are split into training, validation, and testing sets in a ratio of 7:1:2.
Hardware Specification	Yes	The total training time is around 3 days and 1 day with 8 V100 GPUs, respectively.
Software Dependencies	No	The paper mentions software such as "Blender", the "Adam" optimizer, and models such as "StyleGAN", "CLIP's ViT-B/32", "BLIP-2", and "MiniGPT-4", but it does not give version numbers for any of them (e.g., Blender 2.93, PyTorch 1.x.x).
Experiment Setup	Yes	We use Adam (Kingma & Ba, 2014) optimizer and initialize the learning rate to 0.002. The training batch size is 64. We sample 8,192 points for each mesh object. And the image resolution is 512 × 512 for both rendering and generation. The loss weight parameters λ_pc = 0.01 and λ_gen = 2. ϵ_1 and ϵ_2 are hyperparameters, which are set as 0.0002 and 0.016. λ_bind is set as 0.1 in the experiment.
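
The 7:1:2 split and the hyperparameters quoted in the table can be gathered into a short Python sketch for anyone attempting a reproduction. This is not the authors' code (the table records that none was released). The names `TrainConfig` and `split_indices`, the field names, the random seed, and the floor-based rounding are all assumptions made for illustration; the paper states only the ratio and the numeric values.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values as quoted in the Experiment Setup row; field names are hypothetical.
    optimizer: str = "Adam"
    learning_rate: float = 0.002   # initial learning rate
    batch_size: int = 64
    points_per_mesh: int = 8192    # points sampled per mesh object
    image_resolution: tuple = (512, 512)  # rendering and generation
    lambda_pc: float = 0.01
    lambda_gen: float = 2.0
    lambda_bind: float = 0.1
    epsilon_1: float = 0.0002
    epsilon_2: float = 0.016


def split_indices(n_items, ratios=(7, 1, 2), seed=0):
    """Shuffle range(n_items) and cut it into train/val/test by `ratios`.

    The seed and the floor-based rounding are assumptions; the paper
    only states the 7:1:2 ratio.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = n_items * ratios[0] // total
    n_val = n_items * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # any rounding remainder goes to test
    return train, val, test
```

Calling `split_indices(1000)` yields three disjoint index lists of 700, 100, and 200 items. Fixing the seed keeps the partition stable across runs, which matters here because the paper does not publish its split files.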