Adversarial Discriminative Heterogeneous Face Recognition
Authors: Lingxiao Song, Man Zhang, Xiang Wu, Ran He
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on three NIR-VIS databases show that our proposed approach outperforms state-of-the-art HFR methods, without requiring a complex network or a large-scale training dataset. |
| Researcher Affiliation | Academia | National Laboratory of Pattern Recognition, CASIA; Center for Research on Intelligent Perception and Computing, CASIA; Center for Excellence in Brain Science and Intelligence Technology, CAS |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code of the described methodology. |
| Open Datasets | Yes | The CASIA NIR-VIS 2.0 face database (Li et al. 2013). The BUAA-VisNir face database (Huang, Sun, and Wang 2012). The Oulu-CASIA NIR-VIS facial expression database (Chen et al. 2009). The MS-Celeb-1M dataset (Guo et al. 2016). |
| Dataset Splits | Yes | In our experiments, we follow the View 2 of the standard protocol defined in (Li et al. 2013), which is used for performance evaluation. There are 10-fold experiments in View 2, where each fold contains non-overlapped training and testing lists. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models or types) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | All face images are normalized by similarity transformation using the locations of the two eyes, and then cropped to 144×144, from which 128×128 sub-images are selected by random cropping in training and center cropping in testing. For the local-path, 32×32 patches are cropped around the two eyes and flipped to the same side. As mentioned above, in the cross-spectral hallucination module, images are encoded in YCbCr space. In the feature extraction step, grayscale images are used as input. Our cross-spectral hallucination networks take the architecture of ResNet (He et al. 2016), where the global-path comprises 6 residual blocks and the local-path contains 3 residual blocks. The output of the local-path is fed to the global-path before the last block. In the adversarial discriminative feature learning module, we employ model-B of the Light CNN (Wu, He, and Sun 2015) as our basic model, which includes 9 convolution layers, 4 max-pooling layers, and one fully-connected layer. Parameters of the convolution layers are shared across the VIS and NIR channels, as shown in Fig. 1. The output feature dimension of our approach is 256. |
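
The View 2 protocol quoted in the Dataset Splits row lends itself to a simple evaluation loop. Below is a minimal sketch assuming the fold lists ship as plain-text files; the directory layout and file names are hypothetical, and only the 10-fold structure with non-overlapping train/test lists comes from the paper.

```python
from pathlib import Path

# Hypothetical location of the CASIA NIR-VIS 2.0 View 2 protocol files;
# only the 10-fold, non-overlapping train/test structure is from the paper.
PROTOCOL_DIR = Path("NIR-VIS-2.0/protocols")

def run_view2(train_fn, test_fn):
    """Run all 10 folds of View 2 and collect per-fold test scores."""
    scores = []
    for fold in range(1, 11):
        train_list = (PROTOCOL_DIR / f"train_{fold}.txt").read_text().splitlines()
        test_list = (PROTOCOL_DIR / f"test_{fold}.txt").read_text().splitlines()
        assert not set(train_list) & set(test_list)  # folds must not overlap
        model = train_fn(train_list)
        scores.append(test_fn(model, test_list))
    return scores  # typically summarized as mean and std over the 10 folds
```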
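The cropping scheme in the Experiment Setup row is concrete enough to pin down in code. A minimal sketch, assuming the face is already eye-aligned by the similarity transform upstream; PIL is used purely for illustration, as the paper names no library.

```python
import random
from PIL import Image

FULL, SUB = 144, 128  # crop sizes from the paper

def crop_for_model(img: Image.Image, train: bool) -> Image.Image:
    """Take a 128x128 sub-image from an eye-aligned face: random crop
    during training, center crop during testing, as in the paper."""
    img = img.resize((FULL, FULL))   # similarity alignment assumed done upstream
    margin = FULL - SUB              # 16 pixels of slack on each axis
    if train:
        x, y = random.randint(0, margin), random.randint(0, margin)
    else:
        x = y = margin // 2          # center crop
    return img.crop((x, y, x + SUB, y + SUB))
```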
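The hallucination networks are specified only by their block counts and the injection point of the local-path. The PyTorch sketch below reproduces that two-path layout; the channel widths, 1×1 concatenation fusion, and upsampling of the local features are all assumptions not stated in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    """Plain residual block: conv-BN-ReLU-conv-BN plus an identity skip."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))

    def forward(self, x):
        return F.relu(x + self.body(x))

class HallucinationNet(nn.Module):
    """Two-path layout from the paper: a 6-block global-path over the
    128x128 face and a 3-block local-path over 32x32 eye patches, with
    the local output fed to the global-path before its last block."""
    def __init__(self, ch=64):
        super().__init__()
        self.stem = nn.Conv2d(1, ch, 3, padding=1)   # channel counts are assumptions
        self.global_front = nn.Sequential(*[ResBlock(ch) for _ in range(5)])
        self.global_last = ResBlock(ch)              # blocks 1-5, then fuse, then block 6
        self.local_stem = nn.Conv2d(1, ch, 3, padding=1)
        self.local_path = nn.Sequential(*[ResBlock(ch) for _ in range(3)])
        self.fuse = nn.Conv2d(2 * ch, ch, 1)         # fusion operator is an assumption
        self.head = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, face, eye_patch):
        g = self.global_front(self.stem(face))           # 128x128 global features
        l = self.local_path(self.local_stem(eye_patch))  # 32x32 eye features
        l = F.interpolate(l, size=g.shape[-2:])          # upsampling choice is an assumption
        g = self.fuse(torch.cat([g, l], dim=1))          # inject local into global
        return self.head(self.global_last(g))
```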
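Finally, the weight sharing described for the feature-learning module amounts to applying one convolutional trunk to both modalities. A minimal sketch with a toy stand-in backbone; the real model is Light CNN model-B, and only the 256-d output dimension below is faithful to the paper.

```python
import torch
import torch.nn as nn

class SharedTwoChannelExtractor(nn.Module):
    """One convolutional trunk applied to both VIS and NIR inputs, so the
    two channels share all convolution parameters. The trunk here is a toy
    stand-in for Light CNN model-B (9 conv layers, 4 max-pooling layers,
    and one fully-connected layer in the paper)."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim))  # 256-d embedding, per the paper

    def forward(self, vis, nir):
        # Same weights for both modalities: this is the sharing shown in Fig. 1.
        return self.trunk(vis), self.trunk(nir)
```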