$\text{ID}^3$: Identity-Preserving-yet-Diversified Diffusion Models for Synthetic Face Recognition

Authors: Jianqing Xu, Shen Li, Jiaying Wu, Miao Xiong, Ailin Deng, Jiazhen Ji, Yuge Huang, Guodong Mu, Wenjie Feng, Shouhong Ding, Bryan Hooi

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments across five challenging benchmarks validate the advantages of ID3. Code is released at: https://github.com/hitspring2015/ID3-SFR.
Researcher Affiliation | Collaboration | ¹Tencent Youtu Lab, ²National University of Singapore
Pseudocode | Yes | Algorithm 1: Training Algorithm ... Algorithm 2: ID-Preserving Sampling Alg. ... Algorithm 3: Synthetic Dataset Generation
Open Source Code | Yes | Code is released at: https://github.com/hitspring2015/ID3-SFR.
Open Datasets | Yes | Training Dataset: We train our proposed ID3 on the FFHQ (Karras et al., 2019) dataset. ... In order to compare with DCFace (Kim et al., 2023), we also train ID3 on CASIA-WebFace (Yi et al., 2014).
Dataset Splits | No | The paper mentions training on FFHQ and CASIA-WebFace and evaluating on several benchmarks, but does not specify explicit train/validation/test splits for the training datasets themselves.
Hardware Specification | Yes | All models are implemented with PyTorch and trained from scratch using 8 NVIDIA Tesla V100 GPUs.
Software Dependencies | No | The paper mentions "implemented with PyTorch" but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | For our ID3, we implement the denoising network with a U-Net architecture and the projection module with a three-layer perceptron (hidden-layer sizes (512, 256, 768)) with ReLU activation. ... We set $\lambda_t = 0.5\,(1 - 1/(1 + \exp(-t/T)))$ for the loss coefficient in Eq. (3), and use $T = 1{,}000$ for the diffusion model; the training batch size is set to 16 and the total number of training steps to 500,000. We directly use a pre-trained face recognition (FR) model sourced from pSp (Richardson et al., 2021) as the identity feature extractor. ... In addition, we set the number of identity embeddings to m = 25 in Eq. (9) for each ID and match their embeddings with randomly selected attributes as conditioning signals for the diffusion model. For face recognition, we use LResNet50-IR (Deng et al., 2019), a variant of ResNet (He et al., 2016), as the backbone framework and follow the original configurations.
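The loss-coefficient schedule quoted in the experiment setup can be sketched in a few lines of Python. This is a minimal sketch assuming the reconstructed form $\lambda_t = 0.5\,(1 - 1/(1 + \exp(-t/T)))$ with $T = 1{,}000$; the function name is illustrative, not from the paper's code.

```python
import math


def loss_coefficient(t: int, T: int = 1000) -> float:
    """Sigmoid-decay loss coefficient: lambda_t = 0.5 * (1 - 1 / (1 + exp(-t/T))).

    Equals 0.25 at t = 0 and decays monotonically toward
    0.5 * (1 - 1 / (1 + e^-1)) ~ 0.13 as t approaches T
    (T = 1,000 diffusion timesteps in the reported setup).
    """
    return 0.5 * (1.0 - 1.0 / (1.0 + math.exp(-t / T)))


# Sample the schedule every 250 timesteps; values decrease with t,
# down-weighting the identity-related loss term at later (noisier) timesteps.
schedule = [loss_coefficient(t) for t in range(0, 1001, 250)]
```

Under this reading, the coefficient is largest at the start of the diffusion trajectory, which is consistent with weighting identity preservation most heavily where the image content is least corrupted.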