Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

HeadSculpt: Crafting 3D Head Avatars with Text

Authors: Xiao Han, Yukang Cao, Kai Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang, Kwan-Yee K. Wong

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We showcase HeadSculpt's superior fidelity and editing capabilities through comprehensive experiments and comparisons with existing methods. We will now assess the efficacy of our HeadSculpt across different scenarios, while also conducting a comparative analysis against state-of-the-art text-to-3D generation pipelines.
Researcher Affiliation | Collaboration | 1 University of Surrey; 2 The University of Hong Kong; 3 Imperial College London; 4 iFlyTek-Surrey Joint Research Centre on AI; 5 Surrey Institute for People-Centred AI
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper provides a webpage link (https://brandonhan.uk/HeadSculpt) but does not explicitly state that the source code for the described methodology is available there or elsewhere.
Open Datasets | Yes | Specifically, we employ a ControlNet C trained on a large-scale 2D face dataset [86, 12], using facial landmarks rendered from MediaPipe [41, 31] as ground-truth data. To this end, we first randomly download 34 images of the back view of human heads, without revealing any personal identities, to construct a tiny dataset D, and then we optimize the special token v (i.e., <back-view>) to better fit the collected images, similar to the textual inversion [17]: We use the default training recipe provided by Hugging Face Diffusers 2, which took us 1 hour on a single Tesla V100 GPU. (Footnote 2 points to: https://github.com/huggingface/diffusers/blob/main/examples/textual_inversion)
Dataset Splits | No | The paper specifies training iterations and mentions testing/evaluation but does not explicitly detail a separate validation dataset split.
Hardware Specification | Yes | Hardware: GPU, 1 Tesla V100 (32GB)
Software Dependencies | Yes | HeadSculpt builds upon Stable-DreamFusion [73] and Huggingface Diffusers [78, 53]. We utilize version 1.5 of Stable Diffusion [69] and version 1.1 of ControlNet [84, 12] in our implementation.
Experiment Setup | Yes | Table 2: Hyper-parameters of HeadSculpt. Resolution for coarse: (64, 64); Resolution for fine: (512, 512); #Iterations for coarse: 70k; #Iterations for fine: 50k; Batch size: 4; LR of grid encoder: 1e-3; LR of NeRF MLP: 1e-3; LR of s_i and v_i in DMTET: 1e-2; LR scheduler: constant; Warmup iterations: 20k; Optimizer: Adam (0.9, 0.99); Weight decay: 0; Precision: fp16
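
For anyone attempting to re-run this setup, the Table 2 hyper-parameters quoted under Experiment Setup can be gathered into a single configuration object. The following is a minimal sketch and not the authors' code: the class name `HeadSculptConfig`, its field names, and the `param_groups` helper are hypothetical, and only the values come from the table. Reading the tuple after "Adam" as the optimizer's beta coefficients is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class HeadSculptConfig:
    """Values transcribed from Table 2; all names here are hypothetical."""

    coarse_resolution: Tuple[int, int] = (64, 64)
    fine_resolution: Tuple[int, int] = (512, 512)
    coarse_iterations: int = 70_000
    fine_iterations: int = 50_000
    batch_size: int = 4
    lr_grid_encoder: float = 1e-3
    lr_nerf_mlp: float = 1e-3
    lr_dmtet: float = 1e-2
    lr_scheduler: str = "constant"
    warmup_iterations: int = 20_000
    # Assumption: "(0.9, 0.99)" in the table denotes Adam's beta coefficients.
    adam_betas: Tuple[float, float] = (0.9, 0.99)
    weight_decay: float = 0.0
    precision: str = "fp16"

    def total_iterations(self) -> int:
        """Sum of the coarse-stage and fine-stage iteration counts."""
        return self.coarse_iterations + self.fine_iterations

    def param_groups(self) -> List[Dict[str, object]]:
        """Per-component learning rates in a framework-neutral form."""
        return [
            {"name": "grid_encoder", "lr": self.lr_grid_encoder},
            {"name": "nerf_mlp", "lr": self.lr_nerf_mlp},
            {"name": "dmtet", "lr": self.lr_dmtet},
        ]


if __name__ == "__main__":
    cfg = HeadSculptConfig()
    print("total iterations:", cfg.total_iterations())
    for group in cfg.param_groups():
        print(group["name"], group["lr"])
```

Keeping the values in a frozen dataclass makes accidental edits to a reported setting fail loudly, which is useful when comparing a re-run against the published configuration.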