Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields

Authors: Keqiang Sun, Shangzhe Wu, Zhaoyang Huang, Ning Zhang, Quan Wang, Hongsheng Li

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evidence from Section 4 (Experiments): "Datasets. We train our model on the CelebA [24] dataset, which consists of 200k celebrity face images with various poses, expressions, and lighting conditions."; "Metrics. (1) Disentanglement Score (DS). ... (5) Fréchet Inception Distance (FID)."; Sections 4.2 (Quantitative Evaluation) and 4.3 (Ablation Study). (A hedged FID computation sketch follows the table.)
Researcher Affiliation | Collaboration | Keqiang Sun (1), Shangzhe Wu (2), Zhaoyang Huang (1), Ning Zhang (3), Quan Wang (3), Hongsheng Li (1,4); (1) CUHK MMLab, (2) Oxford VGG, (3) SenseTime Research, (4) Centre for Perceptual and Interactive Intelligence Limited
Pseudocode | No | The paper describes its methodology through narrative text and diagrams (e.g., Figure 2: Method overview), but it does not include any formal pseudocode or algorithm blocks.
Open Source Code | Yes | "Find code and more demo at https://keqiangsun.github.io/projects/cgof." and "We release the code and instructions at https://keqiangsun.github.io/projects/cgof."
Open Datasets | Yes | "Datasets. We train our model on the CelebA [24] dataset, which consists of 200k celebrity face images with various poses, expressions, and lighting conditions." (A hedged CelebA loading sketch follows the table.)
Dataset Splits | No | The paper mentions using CelebA and CelebA-HQ for training and finetuning, but it does not explicitly provide the percentages or sample counts of the training, validation, and test splits used in its experiments.
Hardware Specification | Yes | "The final model is trained for 72 hours on 8 GeForce GTX TITAN X GPUs."
Software Dependencies | No | The paper states "We build our model on top of the implementation of pi-GAN [4], which uses a SIREN-based NeRF representation," but it does not provide specific version numbers for any software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | "We provide implementation details in the supplementary material, including network architectures and hyper-parameters. Here, we highlight a few important ones. We build our model on top of the implementation of pi-GAN [4], which uses a SIREN-based NeRF representation. We sample N_surf = 12 points around the 3DMM input mesh and N_vol = 12 coarse points for the density regularizer. The final model is trained for 72 hours on 8 GeForce GTX TITAN X GPUs." (A hedged configuration sketch follows the table.)
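
The paper reports Fréchet Inception Distance (FID) among its metrics, but no evaluation code is quoted above. As a point of reference only, the snippet below is a minimal sketch of computing FID with the torchmetrics library; the use of torchmetrics, the helper name compute_fid, and the placeholder batches are assumptions, not the authors' evaluation pipeline.

```python
# Minimal FID sketch (assumption: not the authors' evaluation code).
# Requires the torchmetrics and torch-fidelity packages.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance


def compute_fid(real_images: torch.Tensor, fake_images: torch.Tensor) -> float:
    """real_images / fake_images: uint8 tensors of shape (N, 3, H, W)."""
    fid = FrechetInceptionDistance(feature=2048)
    fid.update(real_images, real=True)   # accumulate Inception features of real faces
    fid.update(fake_images, real=False)  # accumulate features of generated faces
    return fid.compute().item()


if __name__ == "__main__":
    # Random placeholder batches; in practice these would be CelebA crops
    # and samples rendered by the generator.
    real = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)
    fake = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)
    print(f"FID on placeholder data: {compute_fid(real, fake):.2f}")
```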
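
The training data is the public CelebA dataset. The paper does not describe its loading or preprocessing pipeline; the sketch below shows one common way to obtain the aligned images via torchvision. The crop size, target resolution, and batch size are illustrative assumptions.

```python
# Hedged example of obtaining CelebA with torchvision; the paper does not
# specify its data pipeline, so all preprocessing choices here are assumptions.
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import CelebA

transform = T.Compose([
    T.CenterCrop(178),   # assumption: square crop of the 178x218 aligned faces
    T.Resize(128),       # assumption: low training resolution, pi-GAN style
    T.ToTensor(),
])

# download=True fetches the aligned-and-cropped images (Google Drive quota
# limits may require a manual download instead).
dataset = CelebA(root="data", split="train", target_type="attr",
                 transform=transform, download=True)
loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4)

images, attrs = next(iter(loader))
print(images.shape)  # torch.Size([32, 3, 128, 128])
```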
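
For the experiment-setup row, the only concrete hyper-parameters quoted are the two sampling counts and the training budget. The sketch below collects them in a single configuration object; the dataclass itself, its field names, and every value not quoted above are assumptions rather than the released configuration.

```python
# Hedged sketch of the quoted training setup. Only n_surf, n_vol, the GPU
# count, and the wall-clock budget come from the paper; the rest is assumed.
from dataclasses import dataclass


@dataclass
class CGOFTrainingConfig:
    # Quoted in the paper:
    n_surf: int = 12        # points sampled around the 3DMM input mesh
    n_vol: int = 12         # coarse points for the density regularizer
    num_gpus: int = 8       # GeForce GTX TITAN X
    train_hours: int = 72   # total wall-clock training time
    # Assumptions (not stated in the quoted text):
    backbone: str = "pi-GAN (SIREN-based NeRF)"
    dataset: str = "CelebA"


if __name__ == "__main__":
    print(CGOFTrainingConfig())
```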