Dual Variational Generation for Low Shot Heterogeneous Face Recognition

Authors: Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on four HFR databases show that our method can significantly improve state-of-the-art results.
Researcher Affiliation | Academia | Chaoyou Fu (1,2), Xiang Wu (1), Yibo Hu (1), Huaibo Huang (1), Ran He (1,2,3). Affiliations: 1) NLPR & CRIPAC, CASIA; 2) University of Chinese Academy of Sciences; 3) Center for Excellence in Brain Science and Intelligence Technology, CAS.
Pseudocode | No | The paper describes the method using mathematical equations and textual descriptions but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about releasing open-source code or a link to a code repository.
Open Datasets | Yes | Three NIR-VIS heterogeneous face databases and one Sketch-Photo heterogeneous face database are used to evaluate our proposed method. For the NIR-VIS face recognition, following [35], we report Rank-1 accuracy and verification rate (VR)@false accept rate (FAR) for the CASIA NIR-VIS 2.0 [24], Oulu-CASIA NIR-VIS [18] and BUAA-VisNir Face [14] databases. Note that, for the Oulu-CASIA NIR-VIS database, only 20 subjects are selected as the training set. In addition, the IIIT-D Viewed Sketch database [1] is employed for the Sketch-Photo face recognition. Due to the small number of images in the IIIT-D Viewed Sketch database, following the protocols of [3], we use the CUHK Face Sketch FERET (CUFSF) [37] as the training set and report the Rank-1 accuracy and VR@FAR=1% for comparisons. (See the metric sketch after this table.)
Dataset Splits | No | The paper states training sets for some databases (e.g., 20 subjects are selected as the training set for Oulu-CASIA NIR-VIS, and 200 persons are used as the training set with the remaining 137 persons as the testing set for Multi-PIE) and mentions evaluation metrics for testing, but it does not provide explicit training/validation/test splits with percentages or sample counts for all datasets, so the data partitioning is not fully reproducible.
Hardware Specification | No | The paper does not provide specific hardware details, such as the GPU or CPU models used to run the experiments.
Software Dependencies | No | The paper mentions optimizers (Adam, SGD) and specific models (Light CNN-9, Light CNN-29) but does not provide version numbers for any software dependencies or libraries.
Experiment Setup | Yes | For the dual variational generation, the architectures of the encoder and decoder networks are the same as [15], and the architecture of our discriminator is the same as [31]. These networks are trained using the Adam optimizer with a fixed learning rate of 0.0002. The other parameters λ1, λ2, λ3 and λ4 in Eq. (9) are set to 50, 5, 1000 and 0.2, respectively. For the heterogeneous face recognition, we utilize both Light CNN-9 and Light CNN-29 [34] as the backbones. The models are pre-trained on the MS-Celeb-1M database [9] and fine-tuned on the HFR training sets. All the face images are aligned to 144 × 144 and randomly cropped to 128 × 128 as the input for training. Stochastic gradient descent (SGD) is used as the optimizer, where the momentum is set to 0.9 and the weight decay is set to 5e-4. The learning rate is set to 1e-3 initially and reduced to 5e-4 gradually. The batch size is set to 64 and the dropout ratio is 0.5. The trade-off parameter α1 in Eq. (12) is set to 0.001 during training. (See the configuration sketch after this table.)
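
For concreteness, here is a minimal NumPy sketch of the two metrics reported in the Open Datasets row, Rank-1 accuracy and VR@FAR. It assumes a probe-by-gallery similarity matrix (e.g., cosine similarities between cross-modality features); the function names and the quantile-based thresholding are our own choices, not taken from the paper.

```python
import numpy as np

def rank1_accuracy(sim, probe_labels, gallery_labels):
    # sim[i, j]: similarity between probe i and gallery j (e.g., cosine).
    # Rank-1: fraction of probes whose most similar gallery entry
    # shares their identity label.
    top1 = np.asarray(gallery_labels)[sim.argmax(axis=1)]
    return float(np.mean(top1 == np.asarray(probe_labels)))

def vr_at_far(sim, probe_labels, gallery_labels, far=0.001):
    # Split pairwise scores into genuine (same identity) and impostor pairs.
    match = np.asarray(probe_labels)[:, None] == np.asarray(gallery_labels)[None, :]
    genuine, impostor = sim[match], sim[~match]
    # Accept threshold chosen so that a fraction `far` of impostor pairs pass.
    threshold = np.quantile(impostor, 1.0 - far)
    # Verification rate: fraction of genuine pairs accepted at that threshold.
    return float(np.mean(genuine >= threshold))
```

For the Sketch-Photo protocol, VR@FAR=1% would correspond to `vr_at_far(sim, probe_labels, gallery_labels, far=0.01)`.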
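
The Experiment Setup row translates fairly directly into a training configuration. The PyTorch sketch below wires up the reported hyperparameters; the backbone and generator modules are placeholders (the paper uses Light CNN-9/29 and the encoder/decoder/discriminator architectures of [15] and [31]), and the step learning-rate schedule is an assumption, since the paper only says the rate is reduced gradually from 1e-3 to 5e-4.

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Input pipeline: faces aligned to 144 x 144, randomly cropped to 128 x 128.
# Resize stands in for the paper's (unspecified) alignment step.
train_transform = transforms.Compose([
    transforms.Resize((144, 144)),
    transforms.RandomCrop(128),
    transforms.ToTensor(),
])

# Placeholder backbone; the paper uses Light CNN-9/29 pre-trained on
# MS-Celeb-1M and fine-tuned on the HFR training sets.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Dropout(p=0.5),                      # dropout ratio 0.5
    nn.Linear(32, 256),
)

# Recognition fine-tuning: SGD with momentum 0.9 and weight decay 5e-4,
# initial learning rate 1e-3 reduced toward 5e-4; StepLR is our guess at
# the unspecified schedule.
recog_opt = torch.optim.SGD(backbone.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=5e-4)
recog_sched = torch.optim.lr_scheduler.StepLR(recog_opt, step_size=10, gamma=0.5)

# Generation networks (encoder, decoder, discriminator): Adam at a fixed
# learning rate of 2e-4. The module below is only a placeholder.
generator = nn.Sequential(nn.Linear(64, 64))
gen_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)

# Loss weights as reported in the paper.
lambda1, lambda2, lambda3, lambda4 = 50.0, 5.0, 1000.0, 0.2  # Eq. (9)
alpha1 = 0.001                                               # Eq. (12)
batch_size = 64
```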