Mis-Classified Vector Guided Softmax Loss for Face Recognition

Authors: Xiaobo Wang, Shifeng Zhang, Shuo Wang, Tianyu Fu, Hailin Shi, Tao Mei

AAAI 2020, pp. 12241-12248 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on several benchmarks have demonstrated the effectiveness of our method over state-of-the-art alternatives.
Researcher Affiliation | Collaboration | (1) JD AI Research, Beijing, China; (2) CBSR & NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Pseudocode | Yes | Algorithm 1: MV-Softmax (a hedged re-implementation sketch follows this table)
Open Source Code | Yes | Our code is available at http://www.cbsr.ia.ac.cn/users/xiaobowang/.
Open Datasets | Yes | The original MS-Celeb-1M dataset (Guo et al. 2016) contains about 100K identities with 10M images, but it includes a great many noisy faces. Fortunately, the trillion-pairs consortium (Deepglint 2018) has produced a well-cleaned, high-quality version, MS-Celeb-1M-v1c, for training.
Dataset Splits | No | The paper describes training data, test data, and evaluation protocols, but does not explicitly describe a validation split (e.g., specific percentages or counts held out from the training data for use during training).
Hardware Specification | Yes | All the CNN models are trained from scratch with the stochastic gradient descent (SGD) algorithm, with a batch size of 32 per GPU on 4 P40 or 4 V100 GPUs in parallel (total batch size 128).
Software Dependencies | No | All experiments in this paper are implemented with the PyTorch library; however, no specific PyTorch version number is provided.
Experiment Setup | Yes | All the CNN models are trained from scratch with SGD, with a batch size of 32 per GPU on 4 P40 or 4 V100 GPUs in parallel (total batch size 128). The weight decay is set to 0.0005 and the momentum to 0.9. The learning rate starts at 0.1 and is divided by 10 at epochs 4, 8, and 10; training finishes at epoch 12. ... t is set empirically to 0.2 for MV-Arc-Softmax-f, 0.3 for MV-Arc-Softmax-a, 0.25 for MV-AM-Softmax-f, and 0.2 for MV-AM-Softmax-a in the subsequent experiments. (A training-schedule sketch follows below.)
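As a rough illustration of the pseudocode noted above (Algorithm 1: MV-Softmax), the following is a minimal PyTorch sketch of the fixed re-weighting MV-AM-Softmax variant ("MV-AM-Softmax-f"): negative classes whose cosine logit exceeds the margined target logit are flagged as mis-classified and shifted up by t before the softmax. Only t = 0.25 comes from the quoted setup; the scale s = 32 and margin m = 0.35 are typical AM-Softmax defaults assumed here, and the class name `MVAMSoftmaxF` is hypothetical. This sketches the mechanism and is not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MVAMSoftmaxF(nn.Module):
    """Sketch of mis-classified vector guided AM-Softmax, fixed variant."""

    def __init__(self, feat_dim, num_classes, s=32.0, m=0.35, t=0.25):
        super().__init__()
        self.s, self.m, self.t = s, m, t
        self.weight = nn.Parameter(torch.empty(num_classes, feat_dim))
        nn.init.xavier_normal_(self.weight)

    def forward(self, feats, labels):
        # cos(theta) between L2-normalized features and class weights.
        cos = F.linear(F.normalize(feats), F.normalize(self.weight))
        idx = labels.view(-1, 1)
        margined = cos.gather(1, idx) - self.m   # f(m, theta_y) = cos(theta_y) - m
        # Indicator I_k: negative classes whose logit beats the margined target.
        mis = (cos > margined).float().scatter(1, idx, 0.0)
        # Fixed variant: lift each mis-classified negative logit by t, then
        # place the margined logit at the target position.
        logits = (cos + self.t * mis).scatter(1, idx, margined)
        return F.cross_entropy(self.s * logits, labels)
```

Usage would mirror any margin-based head: `loss = MVAMSoftmaxF(512, num_ids)(embeddings, labels)`. The adaptive "-a" variant instead scales the shift by the mis-classified cosine itself.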
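The quoted optimization schedule also translates directly into a standard PyTorch loop. The sketch below is a reconstruction under stated assumptions: `backbone`, `margin_head` (e.g., the loss module above), and `train_loader` are hypothetical placeholders, and the 4-GPU data-parallel wrapping is omitted.

```python
import torch

# Hypothetical placeholders: a backbone producing embeddings and a
# margin-based classification head (e.g., MVAMSoftmaxF above).
params = list(backbone.parameters()) + list(margin_head.parameters())
optimizer = torch.optim.SGD(params, lr=0.1, momentum=0.9, weight_decay=0.0005)
# lr starts at 0.1 and is divided by 10 at epochs 4, 8, and 10.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[4, 8, 10], gamma=0.1)

for epoch in range(12):                  # training finishes at epoch 12
    for images, labels in train_loader:  # batch 32 per GPU, 4 GPUs -> 128 total
        optimizer.zero_grad()
        loss = margin_head(backbone(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```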