Learning Invariant Deep Representation for NIR-VIS Face Recognition
Authors: Ran He, Xiang Wu, Zhenan Sun, Tieniu Tan
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluations show that our method achieves a 94% verification rate at FAR=0.1% on the challenging CASIA NIR-VIS 2.0 face recognition dataset. Compared with state-of-the-art methods, it reduces the error rate by 58% with only a compact 64-D representation. |
| Researcher Affiliation | Academia | Ran He, Xiang Wu, Zhenan Sun, Tieniu Tan. National Laboratory of Pattern Recognition, CASIA; Center for Research on Intelligent Perception and Computing, CASIA; Center for Excellence in Brain Science and Intelligence Technology, CAS; University of Chinese Academy of Sciences, Beijing 100190, China. {rhe,znsun,tnt}@nlpr.ia.ac.cn, alfredxiangwu@gmail.com |
| Pseudocode | Yes | Algorithm 1: Training the IDR network. |
| Open Source Code | Yes | We employ the lightened CNN B network (Wu et al. 2015) as the basic network (code: https://github.com/AlfredXiangWu/face_verification_experiment). |
| Open Datasets | Yes | The CASIA NIR-VIS 2.0 Face Database (Li et al. 2013) is widely used in NIR-VIS heterogeneous face evaluations because it is the largest public and most challenging NIR-VIS database. The MS-Celeb-1M dataset (Guo et al. 2016), which contains 8.5M images of about 100K identities in total, is used to train the basic network. |
| Dataset Splits | Yes | View 1 is used for hyper-parameter tuning, and View 2 is used for training and testing. For a fair comparison with other results, we follow the standard protocol in View 2, which comprises 10 folds, each containing a training list and a testing list. The training and testing sets contain nearly equal numbers of identities and are kept disjoint from each other. Each training fold contains around 2,500 VIS images and around 6,100 NIR images from around 360 subjects, mutually exclusive from the 358 subjects in the testing set. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., CPU, GPU models) used for running the experiments. |
| Software Dependencies | No | The paper mentions using a "lightened CNN B network" and "Softmax is used as the loss function" but does not provide specific version numbers for any software, libraries, or frameworks. |
| Experiment Setup | Yes | The training VIS face images are normalized and cropped to 144x144 according to five facial points. To enrich the input data, the input images are randomly cropped to 128x128. The dropout ratio is set to 0.7 for the fully connected layer. The learning rate of the basic network is set to 1e-3 initially and reduced to 1e-5 over 4,000,000 iterations; the learning rate of the IDR network is set to 1e-4 initially and reduced gradually to 1e-6 over around 100,000 iterations. |
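The preprocessing and learning-rate schedule described in the Experiment Setup row can be sketched as below. This is a minimal illustration, not the authors' code: the random 128x128 crop from a 144x144 face follows the paper's description, while the log-linear decay curve is an assumption (the paper only says the rate is "reduced gradually" from 1e-4 to 1e-6).

```python
import numpy as np

def random_crop(image, crop_size=128, rng=None):
    """Randomly crop a square patch from a normalized face image.

    The paper normalizes faces to 144x144 via five facial points and
    randomly crops 128x128 patches to enrich the training data.
    """
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    top = int(rng.integers(0, h - crop_size + 1))
    left = int(rng.integers(0, w - crop_size + 1))
    return image[top:top + crop_size, left:left + crop_size]

def idr_learning_rate(iteration, base_lr=1e-4, final_lr=1e-6, total_iters=100_000):
    """Decay the IDR learning rate from 1e-4 to 1e-6 over ~100k iterations.

    A log-linear (geometric) decay is assumed here; the paper does not
    specify the exact schedule.
    """
    t = min(iteration / total_iters, 1.0)
    return float(base_lr * (final_lr / base_lr) ** t)
```

The basic network's schedule (1e-3 down to 1e-5 over 4M iterations) can reuse `idr_learning_rate` with different arguments.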