Loss Function Search for Face Recognition

Authors: Xiaobo Wang, Shuo Wang, Cheng Chi, Shifeng Zhang, Tao Mei

Venue: ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on the face recognition benchmarks, including LFW, SLLFW, CALFW, CPLFW, AgeDB, CFP, RFW, MegaFace and Trillion Pairs, which have verified the superiority of our new approach over the baseline Softmax loss, the handcrafted heuristic margin-based Softmax losses, and the AutoML method AM-LFS.
Researcher Affiliation | Collaboration | 1) JD AI Research; 2) Institute of Automation, Chinese Academy of Sciences. Correspondence to: Shifeng Zhang <shifeng.zhang@nlpr.ia.ac.cn>.
Pseudocode | Yes | Algorithm 1: Search-Softmax.
Open Source Code | Yes | To allow more experimental verification, our code is available at http://www.cbsr.ia.ac.cn/users/xiaobowang/.
Open Datasets | Yes | This paper involves two popular training datasets, including CASIA-WebFace (Yi et al., 2014) and MS-Celeb-1M (Guo et al., 2016).
Dataset Splits | Yes | In the outer level, we optimize the modulating factor a by REINFORCE (Williams, 1992) with rewards (i.e., accuracy on LFW) from a fixed number of sampled models. (See the REINFORCE sketch after this table.)
Hardware Specification | Yes | For all the datasets, each sampled model is trained with 2 P40 GPUs, so a total of 8 GPUs are used.
Software Dependencies | No | The paper states: 'All experiments in this paper are implemented by PyTorch (Paszke et al., 2019).' However, a specific version number for PyTorch is not provided.
Experiment Setup | Yes | The total batch size is 128. The weight decay is set to 0.0005 and the momentum is 0.9. The learning rate is initially 0.1. For CASIA-WebFace-R, we empirically divide the learning rate by 10 at epochs 9, 18, and 26, and finish the training process at epoch 30. For MS-Celeb-1M-v1c-R, we divide the learning rate by 10 at epochs 4, 8, and 10, and finish the training process at epoch 12. ... We use the Adam optimizer with a learning rate of η = 0.05 and set σ = 0.2 for updating the distribution parameter µ. (See the training-setup sketch after this table.)
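The "Dataset Splits" row quotes the paper's outer-loop search: the distribution parameter µ of the modulating factor is updated by REINFORCE, using LFW verification accuracy of sampled models as the reward. Below is a minimal sketch of one such update, assuming a Gaussian sampling distribution N(µ, σ²); the paper reports σ = 0.2 and an Adam learning rate of η = 0.05, but the exact parameterization and the `train_and_evaluate` helper here are assumptions, not the authors' code.

```python
import torch


def train_and_evaluate(a: float) -> float:
    """Hypothetical placeholder: train one model with modulating factor `a`
    and return its LFW verification accuracy as the reward."""
    return 0.0  # dummy reward so the sketch runs end-to-end


mu = torch.tensor(0.0, requires_grad=True)   # distribution parameter updated by REINFORCE
sigma = 0.2                                  # std reported in the paper
optimizer = torch.optim.Adam([mu], lr=0.05)  # eta = 0.05, as reported
num_models = 4                               # "a fixed number of sampled models"; 4 inferred
                                             # from the 8-GPU / 2-GPUs-per-model hardware note
search_steps = 10                            # assumed; not stated in the excerpt

for step in range(search_steps):
    dist = torch.distributions.Normal(mu, sigma)
    factors = dist.sample((num_models,))  # sample candidate modulating factors a
    rewards = torch.tensor([train_and_evaluate(a.item()) for a in factors])
    baseline = rewards.mean()             # variance-reduction baseline (assumed)
    # REINFORCE: ascend E[R] by descending -E[(R - b) * log pi(a | mu)]
    loss = -((rewards - baseline) * dist.log_prob(factors)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```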
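The "Experiment Setup" row maps directly onto a standard PyTorch training configuration. This is a minimal sketch of the CASIA-WebFace-R schedule under stated assumptions: the backbone, loss, and data loader are placeholders, and the searched Softmax loss itself is not reproduced here.

```python
import torch
import torch.nn as nn

model = nn.Linear(512, 10)          # placeholder for the face recognition backbone
criterion = nn.CrossEntropyLoss()   # placeholder for the searched Softmax loss
# Dummy batch standing in for the real data loader; total batch size is 128.
train_loader = [(torch.randn(128, 512), torch.randint(0, 10, (128,)))]

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,             # initial learning rate, as reported
    momentum=0.9,       # as reported
    weight_decay=5e-4,  # as reported
)
# Divide the learning rate by 10 at epochs 9, 18, and 26; stop after epoch 30.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[9, 18, 26], gamma=0.1
)

for epoch in range(30):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```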