$\text{ID}^3$: Identity-Preserving-yet-Diversified Diffusion Models for Synthetic Face Recognition
Authors: Jianqing Xu, Shen Li, Jiaying Wu, Miao Xiong, Ailin Deng, Jiazhen Ji, Yuge Huang, Guodong Mu, Wenjie Feng, Shouhong Ding, Bryan Hooi
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across five challenging benchmarks validate the advantages of ID3. Code is released at: https://github.com/hitspring2015/ID3-SFR. |
| Researcher Affiliation | Collaboration | 1Tencent Youtu Lab 2National University of Singapore |
| Pseudocode | Yes | Algorithm 1: Training Algorithm ... Algorithm 2: ID-Preserving Sampling Alg. ... Algorithm 3: Synthetic Dataset Generation |
| Open Source Code | Yes | Code is released at: https://github.com/hitspring2015/ID3-SFR. |
| Open Datasets | Yes | Training Dataset: We train our proposed ID3 on the FFHQ (Karras et al., 2019) dataset. ... In order to compare with DCFace (Kim et al., 2023), we also train ID3 on CASIA-WebFace (Yi et al., 2014). |
| Dataset Splits | No | The paper mentions training on FFHQ and CASIA-WebFace and evaluating on several benchmarks, but does not specify explicit train/validation/test splits for the training datasets themselves. |
| Hardware Specification | Yes | All models are implemented with PyTorch and trained from scratch using 8 NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions "implemented with PyTorch" but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For our ID3, we implement the denoising network with a U-Net architecture and the projection module with a three-layer perceptron (hidden-layer sizes (512, 256, 768)) with ReLU activation. ... we set $\lambda_t = 0.5\,(1 - 1/(1 + \exp(-t/T)))$ for the loss coefficients in Eq. (3), and use T = 1,000 for the diffusion model; the training batch size is set to 16 and the total training steps to 500,000. We directly use a pre-trained face recognition (FR) model sourced from pSp (Richardson et al., 2021) as the identity feature extractor. ... In addition, we set the number of identity embeddings m = 25 in Eq. (9) for each ID and match their embeddings with randomly selected attributes as conditioning signals for the diffusion model. For face recognition, we use LResNet50-IR (Deng et al., 2019), a variant of ResNet (He et al., 2016), as the backbone framework and follow the original configurations. |
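The experiment-setup row above can be sketched in code. The following PyTorch snippet implements the time-dependent loss coefficient $\lambda_t = 0.5\,(1 - 1/(1 + \exp(-t/T)))$ and a projection module with hidden sizes (512, 256, 768) and ReLU activations, as quoted from the paper. The input/output dimensions and the exact wiring of the perceptron are assumptions for illustration, not details stated in the table.

```python
import math

import torch
import torch.nn as nn


def loss_coefficient(t: float, T: float = 1000.0) -> float:
    """Loss coefficient lambda_t = 0.5 * (1 - 1/(1 + exp(-t/T))).

    Decreases smoothly from 0.25 at t = 0 toward smaller values as
    t grows, down-weighting the identity loss at noisier timesteps.
    """
    return 0.5 * (1.0 - 1.0 / (1.0 + math.exp(-t / T)))


class ProjectionModule(nn.Module):
    """Three-layer perceptron with ReLU, layer widths (512, 256, 768).

    Maps an identity embedding (assumed 512-d here) to the diffusion
    model's conditioning space (assumed 768-d here).
    """

    def __init__(self, in_dim: int = 512) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, 768),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


if __name__ == "__main__":
    # Sanity checks on the schedule and the projection shape.
    print(loss_coefficient(0.0))          # 0.25
    proj = ProjectionModule()
    out = proj(torch.randn(4, 512))
    print(out.shape)                      # torch.Size([4, 768])
```

Note that at t = 0 the coefficient is exactly 0.25 and it monotonically decreases over the T = 1,000 diffusion steps, which matches the form quoted in the setup row.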