HeadSculpt: Crafting 3D Head Avatars with Text

Authors: Xiao Han, Yukang Cao, Kai Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang, Kwan-Yee K. Wong

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We showcase HeadSculpt's superior fidelity and editing capabilities through comprehensive experiments and comparisons with existing methods. We will now assess the efficacy of our HeadSculpt across different scenarios, while also conducting a comparative analysis against state-of-the-art text-to-3D generation pipelines."
Researcher Affiliation | Collaboration | 1 University of Surrey; 2 The University of Hong Kong; 3 Imperial College London; 4 iFlyTek-Surrey Joint Research Centre on AI; 5 Surrey Institute for People-Centred AI
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper provides a project webpage (https://brandonhan.uk/HeadSculpt) but does not explicitly state that the source code for the described methodology is available there or elsewhere.
Open Datasets | Yes | "Specifically, we employ a ControlNet C trained on a large-scale 2D face dataset [86, 12], using facial landmarks rendered from MediaPipe [41, 31] as ground-truth data." "To this end, we first randomly download 34 images of the back view of human heads, without revealing any personal identities, to construct a tiny dataset D, and then we optimize the special token v (i.e., <back-view>) to better fit the collected images, similar to textual inversion [17]." "We use the default training recipe provided by Hugging Face Diffusers, which took us 1 hour on a single Tesla V100 GPU." (The cited footnote points to: https://github.com/huggingface/diffusers/blob/main/examples/textual_inversion)
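Since the paper states it used the default Hugging Face Diffusers textual-inversion recipe for the <back-view> token, that run can be sketched with the example script's standard flags. This is a hedged sketch, not the authors' exact command: the data directory, initializer token, learning rate, and step count below are placeholders we chose for illustration.

```shell
# Sketch of the Hugging Face Diffusers textual-inversion recipe
# (examples/textual_inversion/textual_inversion.py). Paths, the
# initializer token, and training-length values are placeholders;
# the paper reports simply using the default recipe, which took
# about 1 hour on a single Tesla V100.
accelerate launch textual_inversion.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --train_data_dir="./back_view_heads" \
  --learnable_property="object" \
  --placeholder_token="<back-view>" \
  --initializer_token="head" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=5.0e-04 \
  --max_train_steps=3000 \
  --output_dir="./textual_inversion_back_view"
```

The learned <back-view> embedding can then be used in prompts to steer generation toward back-of-head views during optimization.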
Dataset Splits | No | The paper specifies training iterations and mentions testing/evaluation, but it does not explicitly detail a separate validation split.
Hardware Specification | Yes | "Hardware: GPU, 1× Tesla V100 (32GB)"
Software Dependencies | Yes | "HeadSculpt builds upon Stable-DreamFusion [73] and Hugging Face Diffusers [78, 53]. We utilize version 1.5 of Stable Diffusion [69] and version 1.1 of ControlNet [84, 12] in our implementation."
Experiment Setup | Yes | Table 2 (Hyper-parameters of HeadSculpt):
  Resolution (coarse): 64 × 64
  Resolution (fine): 512 × 512
  Iterations (coarse): 70k
  Iterations (fine): 50k
  Batch size: 4
  LR of grid encoder: 1e-3
  LR of NeRF MLP: 1e-3
  LR of s_i and v_i in DMTet: 1e-2
  LR scheduler: constant
  Warmup iterations: 20k
  Optimizer: Adam (0.9, 0.99)
  Weight decay: 0
  Precision: fp16
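For reference, the hyper-parameters reported in Table 2 can be collected into a single configuration dictionary. This is a minimal sketch: the key names are our own illustrative choices and do not correspond to actual Stable-DreamFusion config options.

```python
# Hyper-parameters of HeadSculpt as reported in Table 2 of the paper.
# Key names are illustrative, not actual Stable-DreamFusion options.
HEADSCULPT_HPARAMS = {
    "resolution_coarse": (64, 64),     # coarse (NeRF) stage render size
    "resolution_fine": (512, 512),     # fine (DMTet) stage render size
    "iters_coarse": 70_000,
    "iters_fine": 50_000,
    "batch_size": 4,
    "lr_grid_encoder": 1e-3,
    "lr_nerf_mlp": 1e-3,
    "lr_dmtet_si_vi": 1e-2,            # LR of s_i and v_i in DMTet
    "lr_scheduler": "constant",
    "warmup_iters": 20_000,
    "optimizer": ("adam", 0.9, 0.99),  # Adam with betas (0.9, 0.99)
    "weight_decay": 0.0,
    "precision": "fp16",
}
```

Grouping the values this way makes it easy to check that a reproduction run matches the reported setup before launching training.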