CemiFace: Center-based Semi-hard Synthetic Face Generation for Face Recognition

Authors: Zhonglin Sun, Siyang Song, Ioannis Patras, Georgios Tzimiropoulos

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that with a modest degree of similarity, training on the generated dataset can produce competitive performance compared to previous generation methods.
Researcher Affiliation | Academia | Zhonglin Sun, Queen Mary University of London (zhonglin.sun@qmul.ac.uk); Siyang Song, University of Exeter (ss2796@cam.ac.uk); Ioannis Patras, Queen Mary University of London (i.patras@qmul.ac.uk); Georgios Tzimiropoulos, Queen Mary University of London (g.tzimiropoulos@qmul.ac.uk)
Pseudocode | Yes | The pseudo-code for training and generation is given in Supplementary Material Section A.3. ... Algorithm 1: The training pipeline of our CemiFace ... Algorithm 2: The pipeline of CemiFace-based face dataset generation
Open Source Code | Yes | The code will be available at: https://github.com/szlbiubiubiu/CemiFace
Open Datasets | Yes | We first split face images in the CASIA-WebFace [36] into various levels of groups... For example, with the same data volume, the model trained on the state-of-the-art synthetic dataset DCFace [24] produces 11.23% lower verification performance on the CFP-FP test set than the model with the same architecture trained on the real dataset. ... We employ 3 datasets for training: (a) CASIA-WebFace as used in DCFace; (b) a challenging in-the-wild dataset, Flickr, with 1.2M images collected by us from the Flickr website; (c) VGGFace2 [13], a large-scale dataset containing 3.3M clean images.
Dataset Splits | No | The paper references evaluation datasets (LFW, CFP-FP, AgeDB-30, CPLFW, CALFW) for testing performance, but it does not explicitly define a separate validation split or dataset used during training, with specific percentages or sample counts.
Hardware Specification | Yes | The batch size is 160 on 2 A100 GPUs.
Software Dependencies | No | The paper mentions several models, methods, and optimizers (e.g., CosFace [1], AdamW [44], DDPM [21, 22], DDIM [22], AdaGN [42], UNet [34]), but it does not specify concrete version numbers for any software libraries or frameworks (e.g., Python, PyTorch, TensorFlow versions) required for replication.
Experiment Setup | Yes | Specifically, the margin of CosFace is 0.4, weight decay is 5e-4, and the learning rate is 1e-1, decayed by 10 at the 26th and 34th epochs; in total the model is trained for 40 epochs. We add random resize & crop with a scale of [0.9, 1.0], Random Erasing with a scale of [0.02, 0.1], and random flip. Brightness, contrast, saturation and hue are all set to 0.1. ... The maximum time step T for diffusion training is 1000. For generating the synthetic face recognition dataset, the time step for DDIM [22] is 20. ... Specifically, in the mini-batch, we assign a randomly selected m from -1 to 1 with an interval of 0.02.
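The two concrete numerical details quoted above — the step-decay learning-rate schedule and the grid of similarity levels m — can be sketched in a few lines of Python. This is an illustrative sketch only; the function and variable names (`lr_at_epoch`, `m_grid`) are our own, not from the paper's code:

```python
import random

def lr_at_epoch(epoch, base_lr=0.1, milestones=(26, 34), gamma=0.1):
    """Step-decay schedule: base lr 1e-1, divided by 10 at the 26th and 34th epoch."""
    return base_lr * gamma ** sum(epoch >= m for m in milestones)

# Similarity levels m sampled from -1 to 1 with an interval of 0.02 (101 values).
m_grid = [round(-1.0 + 0.02 * i, 2) for i in range(101)]
m = random.choice(m_grid)  # one randomly selected m per mini-batch
```

With these assumptions, epochs 0-25 train at 0.1, epochs 26-33 at 0.01, and epochs 34-39 at 0.001, matching the 40-epoch schedule quoted in the setup.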