Deeply-learned Hybrid Representations for Facial Age Estimation
Authors: Zichang Tan, Yang Yang, Jun Wan, Guodong Guo, Stan Z. Li
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on five age benchmark datasets, including Web-Face Age, Morph, FG-NET, CACD and Chalearn LAP 2015, show that the proposed method outperforms the state-of-the-art approaches significantly. |
| Researcher Affiliation | Collaboration | 1CBSR&NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China 2University of Chinese Academy of Sciences, Beijing, China 3Institute of Deep Learning, Baidu Research, Beijing, China 4National Engineering Laboratory for Deep Learning Technology and Application, Beijing, China 5Faculty of Information Technology, Macau University of Science and Technology, Macau, China |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating the release of source code for the described methodology. |
| Open Datasets | Yes | We evaluate the proposed method on five benchmark datasets: Web-Face Age, Morph II [Ricanek and Tesafaye, 2006], CACD [Chen et al., 2014], FG-NET and Chalearn LAP 2015 [Escalera et al., 2015]. Besides, we also introduce the IMDB-WIKI dataset [Rothe et al., 2018] used for pretraining. |
| Dataset Splits | Yes | Morph II. This dataset contains 55,134 face images in total. In our experiments, three typical protocols are employed for evaluation: (1) 80-20 protocol: as in the work [Niu et al., 2016], the dataset is randomly divided into two parts, with 80% for training and 20% for testing. (2) Partial 80-20 protocol: following the works [Rothe et al., 2018; Tan et al., 2018], a subset of 5,493 face images of Caucasian descent is used, which reduces the cross-race influence. The subset is then randomly split into two parts: 80% of the images for training and the rest for testing. (3) S1-S2-S3 protocol: the dataset is split into three non-overlapping subsets S1, S2 and S3, following the work [Tan et al., 2018]. All experiments are run twice and the averaged results are reported: 1. train on S1, test on S2+S3; 2. train on S2, test on S1+S3. FG-NET. We follow previous work [Rothe et al., 2018] and adopt the leave-one-person-out (LOPO) strategy for evaluation. CACD. Following previous work [Tan et al., 2018], we employ the subset of 1,800 celebrities with less precise labeling for training, and 80 and 120 cleaned celebrities for validation and test, respectively. Chalearn LAP 2015. This dataset includes training, validation and test subsets with 2,476, 1,136 and 1,079 images, respectively. |
| Hardware Specification | Yes | All models are implemented with PyTorch on a GTX 1080Ti GPU. |
| Software Dependencies | No | The paper mentions PyTorch as the implementation framework but does not provide specific version numbers for it or any other software dependencies. |
| Experiment Setup | Yes | Pre-processing and Augmentation. The images are aligned using facial landmarks (the eye centers and the upper lip), following the work [Tan et al., 2018]. To obtain large feature maps for ARP, we adopt input images of size 256 × 256. Following the settings in [Gao et al., 2018], we augment the face images with random horizontal flipping, scaling, rotation and translation during training. Training Details. All networks are first pretrained on ImageNet and optimized by SGD with Nesterov momentum. The weight decay and the momentum are set to 0.0005 and 0.9. The initial learning rate is set to 0.01 and is reduced by a factor of 10 as the number of iterations increases. |
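The optimization recipe reported in the Experiment Setup row (SGD with Nesterov momentum, momentum 0.9, weight decay 0.0005, initial learning rate 0.01 with step-wise ÷10 decay) can be sketched in plain Python. This is a minimal illustration, not the authors' code; the decay milestones are hypothetical since the paper does not state when the learning rate drops, and the update rule mirrors the coupled-L2, Nesterov variant used by `torch.optim.SGD`.

```python
def step_lr(base_lr, iteration, milestones, gamma=0.1):
    """Step decay: multiply the lr by gamma at each milestone passed.

    milestones are assumed iteration counts (not given in the paper).
    """
    drops = sum(1 for m in milestones if iteration >= m)
    return base_lr * (gamma ** drops)


def sgd_nesterov_step(w, v, grad, lr, momentum=0.9, weight_decay=5e-4):
    """One scalar SGD update with Nesterov momentum and coupled L2 decay."""
    g = grad + weight_decay * w   # L2 weight decay folded into the gradient
    v = momentum * v + g          # velocity (momentum buffer) update
    w = w - lr * (g + momentum * v)  # Nesterov lookahead parameter step
    return w, v
```

With `base_lr=0.01` and assumed milestones `[30000, 60000]`, `step_lr` yields 0.01, then 0.001, then 0.0001 as training passes each milestone, matching the "reduced by a factor of 10" schedule.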
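The Morph II 80-20 protocol described in the Dataset Splits row is a simple random partition. A minimal sketch, assuming the dataset is available as a list of image paths (the helper name and seed are illustrative, not from the paper):

```python
import random


def split_80_20(image_paths, seed=0):
    """Randomly split samples into 80% training / 20% test, as in the
    Morph II 80-20 protocol. The fixed seed is an assumption for
    reproducibility; the paper does not specify one."""
    rng = random.Random(seed)
    paths = list(image_paths)
    rng.shuffle(paths)
    cut = int(0.8 * len(paths))
    return paths[:cut], paths[cut:]
```

The partial 80-20 protocol applies the same split to the 5,493-image Caucasian subset, and the S1-S2-S3 protocol instead fixes three disjoint subsets and averages over two train/test assignments.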