Controllable 3D Face Generation with Conditional Style Code Diffusion
Authors: Xiaolong Shen, Jianxin Ma, Chang Zhou, Zongxin Yang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on FFHQ, CelebA-HQ, and CelebA-Dialog demonstrate the promising performance of our TEx-Face in achieving the efficient and controllable generation of photorealistic 3D faces. |
| Researcher Affiliation | Collaboration | Xiaolong Shen1,2*, Jianxin Ma2, Chang Zhou2, Zongxin Yang1 1 ReLER, CCAI, Zhejiang University, China 2 Alibaba Group, China {sxlongcs, zongxinyang}@zju.edu.cn, {majx13fromthu,ericzhou.zc}@alibaba-inc.com *Work done during Xiaolong Shen's internship at Alibaba Group. |
| Pseudocode | No | The paper describes the proposed methods in detail but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code will be available at https://github.com/sxl142/TEx-Face. |
| Open Datasets | Yes | We train our inversion model on FFHQ (Abdal, Qin, and Wonka 2019) and test it on the CelebA-HQ (Karras et al. 2018) test set. We use CelebA-Dialog (Jiang et al. 2021) and some data processed by our proposed data augmentation strategy to train our diffusion model. |
| Dataset Splits | No | The paper mentions training and test sets but does not explicitly describe a validation set or specific split percentages for dataset partitioning (e.g., 80/10/10 split or specific sample counts for training, validation, and test). |
| Hardware Specification | Yes | We use four Nvidia Tesla V100 (16G) with batch size 8 to train our inversion model, and with batch size 256 for style code diffusion. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' and 'Cosine Annealing scheduler' but does not specify version numbers for key software dependencies like programming languages (e.g., Python 3.8) or deep learning frameworks (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | We use Adam (Kingma and Ba 2017) optimizer with linear warm-up and Cosine Annealing (Loshchilov and Hutter 2017) scheduler. We set the loss weights as follows: λrec = 1, λlpips = 0.8, λid = 0.2. We use four Nvidia Tesla V100 (16G) with batch size 8 to train our inversion model, and with batch size 256 for style code diffusion. |
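
The experiment setup reported above (Adam with linear warm-up and cosine annealing, loss weights λrec = 1, λlpips = 0.8, λid = 0.2) can be sketched in PyTorch. This is a minimal illustration, not the authors' code: the learning rate, warm-up length, total step count, and the placeholder encoder are assumptions filled in only to make the snippet runnable.

```python
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import LinearLR, CosineAnnealingLR, SequentialLR

# Loss weights quoted from the paper's experiment setup.
LAMBDA_REC, LAMBDA_LPIPS, LAMBDA_ID = 1.0, 0.8, 0.2

def total_loss(rec_loss, lpips_loss, id_loss):
    # Weighted sum of reconstruction, LPIPS, and identity losses.
    return LAMBDA_REC * rec_loss + LAMBDA_LPIPS * lpips_loss + LAMBDA_ID * id_loss

# Placeholder module standing in for the inversion encoder (assumption).
encoder = nn.Linear(512, 512)

# Learning rate is not reported in the paper; 1e-4 is an assumed value.
optimizer = Adam(encoder.parameters(), lr=1e-4)

# Linear warm-up followed by cosine annealing, per the reported schedule.
warmup_steps, total_steps = 1_000, 100_000  # assumed step counts
scheduler = SequentialLR(
    optimizer,
    schedulers=[
        LinearLR(optimizer, start_factor=0.01, total_iters=warmup_steps),
        CosineAnnealingLR(optimizer, T_max=total_steps - warmup_steps),
    ],
    milestones=[warmup_steps],
)
```

Each optimizer step would be followed by `scheduler.step()`. Per the paper, the same optimizer and scheduler recipe is used for both stages, with batch size 8 for the inversion model and 256 for the style code diffusion model.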