TopoFR: A Closer Look at Topology Alignment on Face Recognition

Authors: Jun Dan, Yang Liu, Jiankang Deng, Haoyu Xie, Siyuan Li, Baigui Sun, Shan Luo

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on popular face benchmarks demonstrate the superiority of our TopoFR over the state-of-the-art methods. Code and models are available at: https://github.com/modelscope/facechain/tree/main/face_module/TopoFR. 5 Experiments 5.1 Datasets. i) For training, we employ three distinct datasets, namely MS1MV2 [1] (5.8M facial images, 85K identities), Glint360K [41] (17.1M facial images, 360K identities), and WebFace42M [80] dataset (42.5M facial images, 2M identities). ii) For evaluation, we adopt LFW [81], AgeDB-30 [82], CFP-FP [83], CPLFW [84], CALFW [85], IJB-C [23], IJB-B [86] and the ICCV-2021 Masked Face Recognition Challenge (MFR-Ongoing) [27] as the benchmarks to test the performance of our models.
Researcher Affiliation | Collaboration | Jun Dan*1,2, Yang Liu*2,3, Jiankang Deng4, Haoyu Xie2, Siyuan Li2, Baigui Sun2,5, Shan Luo3 1Zhejiang University 2FaceChain Community 3King's College London 4Imperial College London 5Alibaba Group danjun@zju.edu.cn, {yang.15.liu, shan.luo}@kcl.ac.uk, j.deng16@imperial.ac.uk, xiehaoyu.xhy@alibaba-inc.com, sunbaigui85@gmail.com
Pseudocode | No | The architecture of our TopoFR model is depicted in Figure 3. It consists of two components: a feature extractor F and an image classifier C.
Open Source Code | Yes | Code and models are available at: https://github.com/modelscope/facechain/tree/main/face_module/TopoFR.
Open Datasets | Yes | 5.1 Datasets. i) For training, we employ three distinct datasets, namely MS1MV2 [1] (5.8M facial images, 85K identities), Glint360K [41] (17.1M facial images, 360K identities), and WebFace42M [80] dataset (42.5M facial images, 2M identities).
Dataset Splits | No | 5.1 Datasets. i) For training, we employ three distinct datasets, namely MS1MV2 [1] (5.8M facial images, 85K identities), Glint360K [41] (17.1M facial images, 360K identities), and WebFace42M [80] dataset (42.5M facial images, 2M identities). ii) For evaluation, we adopt LFW [81], AgeDB-30 [82], CFP-FP [83], CPLFW [84], CALFW [85], IJB-C [23], IJB-B [86] and the ICCV-2021 Masked Face Recognition Challenge (MFR-Ongoing) [27] as the benchmarks to test the performance of our models.
Hardware Specification | Yes | A.1 Implementation Details Training Details. For MS1MV2 and Glint360K, our models are trained using PyTorch on 4 NVIDIA Tesla A100 GPUs, and a mini-batch of 128 images is assigned for each GPU. In the case of WebFace42M, we train our models (ResNet-200 backbone) using 64 NVIDIA Tesla V100 GPUs.
Software Dependencies | No | A.1 Implementation Details Training Details. For MS1MV2 and Glint360K, our models are trained using PyTorch on 4 NVIDIA Tesla A100 GPUs...
Experiment Setup | Yes | A.1 Implementation Details Training Details. For MS1MV2 and Glint360K, our models are trained using PyTorch on 4 NVIDIA Tesla A100 GPUs, and a mini-batch of 128 images is assigned for each GPU. In the case of WebFace42M, we train our models (ResNet-200 backbone) using 64 NVIDIA Tesla V100 GPUs. We crop all images to 112 × 112, following the same setting as in ArcFace [1, 34]. For the backbones, we adopt ResNet-50, ResNet-100 and ResNet-200 [92] as modified in [1]. We follow [1] to employ ArcFace (s = 64 and m = 0.5) as the basic classification loss to train the TopoFR model. For the TopoFR model trained by CosFace [2], we set the scale s to 64 and the cosine margin m of CosFace to 0.4. To optimize the models, we use the Stochastic Gradient Descent (SGD) optimizer with momentum of 0.9 for both datasets. The weight decay is set to 5e-4 for MS1MV2 and 1e-4 for Glint360K. The initial learning rate is set to 0.1 for both datasets. In terms of the balance coefficient α, we choose α = 0.1 for experiments on R50 TopoFR, and α = 0.05 for experiments on R100 TopoFR and R200 TopoFR. During training, we apply the RSP mechanism with a certain probability. Specifically, for an original input sample x, the probability of it undergoing RSP is ξ, and the probability of it remaining unchanged is 1 − ξ. For the hyper-parameter ξ, we choose ξ = 0.2.
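The quoted setup specifies the ArcFace classification loss with scale s = 64 and angular margin m = 0.5. As a rough illustration of how those two hyper-parameters enter the logits, here is a minimal NumPy sketch (function name, shapes, and the toy inputs are illustrative assumptions, not taken from the paper's code):

```python
import numpy as np

def arcface_logits(embeddings, weights, labels, s=64.0, m=0.5):
    """Additive angular margin logits: s * cos(theta + m) for the target class,
    s * cos(theta) for all other classes."""
    # L2-normalize embeddings and class-centre weights so dot products are cosines
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(emb @ w.T, -1.0, 1.0)        # cosine similarity to each class centre
    theta = np.arccos(cos)
    onehot = np.zeros_like(cos, dtype=bool)
    onehot[np.arange(len(labels)), labels] = True
    # the margin m is added to the angle of the ground-truth class only, then scaled by s
    return s * np.where(onehot, np.cos(theta + m), cos)

# toy example: embedding perfectly aligned with the class-0 centre
emb = np.array([[1.0, 0.0]])
w = np.array([[1.0, 0.0], [0.0, 1.0]])
print(arcface_logits(emb, w, np.array([0])))   # target logit = 64 * cos(0.5) ≈ 56.2
```

Even at theta = 0, the margin caps the target logit at s · cos(m) rather than s, which is what forces tighter intra-class clustering during training; these logits would then be fed to a standard softmax cross-entropy loss.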