Joint Super-Resolution and Alignment of Tiny Faces

Authors: Yu Yin, Joseph Robinson, Yulun Zhang, Yun Fu (pp. 12693-12700)

AAAI 2020

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments demonstrate that the proposed model significantly outperforms the state-of-the-art in both landmark localization and SR of faces. We show a large improvement for landmark localization of tiny faces (i.e., 16×16). Furthermore, the proposed framework yields comparable results for landmark localization on low-resolution (LR) faces (i.e., 64×64) to existing methods on HR (i.e., 256×256).
Researcher Affiliation Academia Yu Yin,1 Joseph P. Robinson,1 Yulun Zhang,1 Yun Fu1,2 1Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 2Khoury College of Computer Science, Northeastern University, Boston, MA
Pseudocode No The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code Yes The code is available at: https://github.com/YuYin1/JASRNet.
Open Datasets Yes 300W (Sagonas et al. 2013; 2016) consists of 3,837 face images with 68 landmarks. We used the same training set as (Lv et al. 2017; Zhu et al. 2015). Three subsets of 300W were evaluated: common, challenge, and full. AFLW (Koestinger et al. 2011) consists of 24,386 faces, each with 21 landmarks. The dataset was split into 20,000 faces for training and the rest (i.e., 4,386) for testing (Dong et al. 2018). Also, the left and right ears were ignored, leaving up to 19 landmarks per face sample. HELEN (Le et al. 2012) contains 2,330 images. The annotations of all 194 landmarks were used as facial prior information. We followed (Chen et al. 2018) to use the last 50 images for testing and the rest for training. LFW (Huang et al. 2007; Learned-Miller 2014) contains 13,233 face images collected from 5,750 people. Each image is labeled with the name of the person pictured. Hence, it was also used to evaluate the recognition capabilities of super-resolved images. Note that this dataset was only used for testing.
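The AFLW protocol quoted above can be sketched as a few lines of Python. This is an illustrative sketch only: the list of face IDs and the split logic are assumptions standing in for the actual dataset loader, but the sizes and landmark count match the quoted description.

```python
# Hypothetical sketch of the AFLW split described above (not the authors' code).
faces = list(range(24386))              # stand-in for the 24,386 AFLW faces

train, test = faces[:20000], faces[20000:]   # 20,000 train, rest test

# 21 annotated landmarks, with the left and right ears ignored.
landmarks_per_face = 21 - 2

print(len(train), len(test), landmarks_per_face)
```

Running this confirms the quoted counts: 20,000 training faces, 4,386 test faces, and up to 19 landmarks per face.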
Dataset Splits No While the paper specifies training and testing splits for AFLW and HELEN, and refers to existing training sets for 300W, it does not explicitly mention or detail a separate 'validation' dataset split for hyperparameter tuning or early stopping.
Hardware Specification Yes Training took about 7 hours on Helen with a Nvidia TITAN-XP GPU.
Software Dependencies No Implementation was done using PyTorch. The paper mentions PyTorch but does not provide a specific version number for it or any other software dependencies.
Experiment Setup Yes Optimization was done with ADAM with a learning rate of 5.0e-5 that was halved at the 20th and 30th epochs. The model was trained with a batch size of 8 for a total of 40 epochs.
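The reported schedule (Adam at 5.0e-5, halved at epochs 20 and 30, 40 epochs, batch size 8) can be sketched as a small helper. The function name and its defaults are illustrative, not from the paper; in a PyTorch implementation the same effect would come from `torch.optim.lr_scheduler.MultiStepLR` with `milestones=[20, 30]` and `gamma=0.5`.

```python
def lr_at_epoch(epoch, base_lr=5.0e-5, milestones=(20, 30), gamma=0.5):
    """Learning rate at a given epoch under the reported step schedule.

    Hypothetical helper mirroring the paper's description: the base rate
    is multiplied by `gamma` (0.5) at each milestone epoch it has passed.
    """
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Schedule over the 40-epoch run (batch size 8 per the quoted setup):
for epoch in [0, 20, 30]:
    print(epoch, lr_at_epoch(epoch))
# epochs 0-19: 5.0e-5; epochs 20-29: 2.5e-5; epochs 30-39: 1.25e-5
```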