Controllable 3D Face Generation with Conditional Style Code Diffusion

Authors: Xiaolong Shen, Jianxin Ma, Chang Zhou, Zongxin Yang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on FFHQ, CelebA-HQ, and CelebA-Dialog demonstrate the promising performance of our TEx-Face in achieving the efficient and controllable generation of photorealistic 3D faces.
Researcher Affiliation | Collaboration | Xiaolong Shen (1,2,*), Jianxin Ma (2), Chang Zhou (2), Zongxin Yang (1). (1) ReLER, CCAI, Zhejiang University, China; (2) Alibaba Group, China. {sxlongcs, zongxinyang}@zju.edu.cn, {majx13fromthu, ericzhou.zc}@alibaba-inc.com. (*) Work done during Xiaolong Shen's internship at Alibaba.
Pseudocode | No | The paper describes the proposed methods in detail but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code will be available at https://github.com/sxl142/TEx-Face.
Open Datasets | Yes | We train our inversion model on FFHQ (Abdal, Qin, and Wonka 2019) and test it on the CelebA-HQ (Karras et al. 2018) test set. We use CelebA-Dialog (Jiang et al. 2021) and some data processed by our proposed data augmentation strategy to train our diffusion model.
Dataset Splits | No | The paper mentions training and test sets but does not explicitly describe a validation set or specific split proportions (e.g., an 80/10/10 split or per-split sample counts for training, validation, and test).
Hardware Specification | Yes | We use four Nvidia Tesla V100 (16G) with batch size 8 to train our inversion model, and with batch size 256 for style code diffusion.
Software Dependencies | No | The paper mentions using the Adam optimizer and a Cosine Annealing scheduler but does not specify versions of key software dependencies such as the programming language (e.g., Python 3.8) or deep learning framework (e.g., PyTorch 1.9).
Experiment Setup | Yes | We use Adam (Kingma and Ba 2017) optimizer with linear warm-up and Cosine Annealing (Loshchilov and Hutter 2017) scheduler. We set the loss weights as follows: λrec = 1, λlpips = 0.8, λid = 0.2. We use four Nvidia Tesla V100 (16G) with batch size 8 to train our inversion model, and with batch size 256 for style code diffusion.
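The quoted setup (linear warm-up into cosine annealing, plus the three weighted loss terms) can be sketched framework-free as below. The base learning rate, warm-up length, and total step count are illustrative assumptions — the paper does not report them — while the loss weights are the ones quoted above (λrec = 1, λlpips = 0.8, λid = 0.2).

```python
import math

def lr_at_step(step, base_lr=1e-4, warmup_steps=1000, total_steps=100_000):
    """Linear warm-up to base_lr, then cosine annealing to zero
    (Loshchilov and Hutter 2017). base_lr, warmup_steps, and
    total_steps are hypothetical values, not from the paper."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the warm-up phase.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

def total_loss(l_rec, l_lpips, l_id):
    """Weighted sum of the inversion losses with the weights the
    paper reports: lambda_rec = 1, lambda_lpips = 0.8, lambda_id = 0.2."""
    return 1.0 * l_rec + 0.8 * l_lpips + 0.2 * l_id

# Illustrative usage with dummy per-term loss values.
lr = lr_at_step(step=5_000)
loss = total_loss(l_rec=0.5, l_lpips=0.25, l_id=0.1)
```

In a PyTorch training loop, the same schedule would typically be realized with `torch.optim.Adam` plus a warm-up wrapper around `CosineAnnealingLR`; the sketch above only makes the reported schedule and loss weighting explicit.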