UniTSFace: Unified Threshold Integrated Sample-to-Sample Loss for Face Recognition

Authors: Qiufu Li, Xi Jia, Jiancan Zhou, Linlin Shen, Jinming Duan

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluation on multiple benchmark datasets, including MFR, IJB-C, LFW, CFP-FP, AgeDB, and MegaFace, demonstrates that the proposed USS loss is highly efficient and can work seamlessly with sample-to-class-based losses. The embedded loss (USS and sample-to-class softmax loss) overcomes the pitfalls of previous approaches, and the trained facial model UniTSFace exhibits exceptional performance, outperforming state-of-the-art methods such as CosFace, ArcFace, VPL, AnchorFace, and UNPG. (A hedged sketch of such a combined loss follows the table.)
Researcher Affiliation | Collaboration | (1) National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, China; (2) Computer Vision Institute, Shenzhen University, China; (3) School of Computer Science, University of Birmingham, UK; (4) Aqara, Lumi United Technology Co., Ltd, China; (5) Alan Turing Institute, UK; (6) SZU Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/CVI-SZU/UniTSFace.
Open Datasets | Yes | We utilize four publicly available datasets for training, namely, CASIA-WebFace (33) (consisting of 0.5 million images of 10K identities), Glint360K (2) (comprising 17.1 million images of 360K identities), WebFace42M (35) (containing 42.5 million images of 2 million identities), and WebFace4M, which is a subset of WebFace42M with 4.2 million images of 0.2 million identities.
Dataset Splits | Yes | For the MFR Ongoing Challenge, the trained models are submitted to and evaluated by the online server. Specifically, we report 1:1 verification accuracy for LFW, CFP-FP, and AgeDB. We report True Accept Rate (TAR) at False Accept Rate (FAR) levels of 1e-4 and 1e-5 for IJB-C. We report TARs at FAR=1e-4 for the Mask and Children test sets, and TARs at FAR=1e-6 for the GMR test sets. For the MegaFace Challenge 1, we report Rank-1 accuracy for identification and TAR at FAR=1e-6 for verification. ... For example, when reporting the 1:1 verification accuracy on LFW, CFP-FP, and AgeDB, 10-fold validation is used. We first select the threshold that achieves the highest accuracy on the first 9 folds and then adopt this threshold to calculate the accuracy on the held-out fold. (A sketch of this threshold-selection protocol follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed machine specifications) used for running its experiments.
Software Dependencies | No | The paper mentions "PyTorch" and "RetinaFace" but does not provide specific version numbers for these software components.
Experiment Setup | Yes | We adopt customized ResNets as our backbone following (7). We implement all models using PyTorch and train them using the SGD optimizer with a weight decay of 5e-4 and momentum of 0.9. For the face models on CASIA-WebFace, we train them over 28 epochs with a batch size of 512. The learning rate starts at 0.1 and is reduced by a factor of 10 at the 16th and 24th epochs. For both Glint360K and WebFace4M, we train the ResNets for 20 epochs using a batch size of 1024. The learning rate is initially set at 0.1, and a polynomial decay strategy (power=2) is applied for the learning-rate schedule. In the case of WebFace42M, we train the ResNets for 20 epochs, using a larger batch size of 4096. The learning rate linearly warms up from 0 to 0.4 during the first epoch, followed by a polynomial decay (power=2) for the remaining 19 epochs. We include the detailed settings of all hyper-parameters used in Sec. 4 and Sec. 5 in the appendix for further reference. (A sketch of this optimizer and schedule setup appears below.)
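
On the loss described in the Research Type row: the paper's exact USS formulation is not reproduced in this report, so the following is only a minimal PyTorch sketch of the general idea, a sample-to-sample loss that pushes positive-pair and negative-pair cosine similarities to opposite sides of a single learnable (unified) threshold, to be added to a sample-to-class softmax term. The class name, the scale value, the threshold initialization, and the softplus surrogate are illustrative assumptions, not the authors' definitive implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnifiedThresholdSampleLoss(nn.Module):
    """Illustrative sample-to-sample loss with one learnable threshold `b`.

    Positive-pair similarities are pushed above `b` and negative-pair
    similarities below it via softplus surrogates. This is a generic
    sketch of the idea, NOT the paper's exact USS loss.
    """

    def __init__(self, scale: float = 32.0):
        super().__init__()
        self.scale = scale
        self.b = nn.Parameter(torch.tensor(0.3))  # unified threshold (assumed init)

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Assumes each identity appears at least twice in the batch.
        z = F.normalize(embeddings, dim=1)            # (N, d) unit embeddings
        sim = z @ z.t()                               # (N, N) cosine similarities
        same = labels.unsqueeze(0) == labels.unsqueeze(1)
        eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
        pos = same & ~eye                             # positive pairs (i != j)
        neg = ~same                                   # negative pairs
        # softplus(s * (b - sim)) penalizes positives below the threshold;
        # softplus(s * (sim - b)) penalizes negatives above it.
        loss_pos = F.softplus(self.scale * (self.b - sim[pos])).mean()
        loss_neg = F.softplus(self.scale * (sim[neg] - self.b)).mean()
        return loss_pos + loss_neg
```

In line with the row above, one would add this term to a sample-to-class loss (e.g., a CosFace-style margin softmax on the classifier logits): `total = softmax_loss + uss_loss(embeddings, labels)`.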
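
On the Dataset Splits row: the 10-fold verification protocol described there is standard for LFW-style benchmarks, and a minimal NumPy sketch is given below. Here `scores` holds pair cosine similarities and `issame` the ground-truth pair labels; the contiguous fold layout and the threshold grid are assumptions.

```python
import numpy as np

def tenfold_accuracy(scores: np.ndarray, issame: np.ndarray, num_folds: int = 10) -> float:
    """For each fold: pick the threshold with the highest accuracy on the
    other 9 folds, then score the held-out fold at that threshold."""
    thresholds = np.arange(-1.0, 1.0, 0.001)        # assumed grid over cosine range
    folds = np.array_split(np.arange(len(scores)), num_folds)
    accs = []
    for k in range(num_folds):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(num_folds) if j != k])
        # Accuracy of every candidate threshold on the 9 selection folds.
        train_accs = [
            np.mean((scores[train_idx] > t) == issame[train_idx]) for t in thresholds
        ]
        best_t = thresholds[int(np.argmax(train_accs))]
        accs.append(np.mean((scores[test_idx] > best_t) == issame[test_idx]))
    return float(np.mean(accs))
```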
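
On the Experiment Setup row: the stated recipe maps onto standard PyTorch components. Below is a minimal sketch of the WebFace42M variant (SGD with momentum 0.9 and weight decay 5e-4, linear warm-up from 0 to 0.4 over the first epoch, then polynomial decay with power 2). The helper name, `steps_per_epoch`, and the per-iteration scheduler composition are assumptions; the authors' released code may differ.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

def build_optimizer(model: torch.nn.Module, steps_per_epoch: int,
                    epochs: int = 20, peak_lr: float = 0.4):
    # SGD hyper-parameters as stated in the paper.
    opt = torch.optim.SGD(model.parameters(), lr=peak_lr,
                          momentum=0.9, weight_decay=5e-4)
    warmup_steps = steps_per_epoch                 # 1 epoch of linear warm-up
    total_steps = epochs * steps_per_epoch

    def lr_lambda(step: int) -> float:
        if step < warmup_steps:                    # linear ramp: 0 -> peak_lr
            return step / warmup_steps
        # Polynomial decay (power=2) over the remaining 19 epochs.
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return (1.0 - progress) ** 2

    return opt, LambdaLR(opt, lr_lambda)
```

With this composition the scheduler is stepped once per training iteration (`scheduler.step()` after each `opt.step()`), so both the warm-up and the decay are resolved at batch granularity.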