Facial Landmarks Detection by Self-Iterative Regression Based Landmarks-Attention Network

Authors: Tao Hu, Honggang Qi, Jizheng Xu, Qingming Huang

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In this section, we perform experiments to demonstrate the effectiveness of the proposed SIR compared to state-of-the-art methods. Specifically, we evaluate the proposed method model by (1) comparing the performance of SIR vs. state-of-the-art and baseline cascaded regression; (2) comparing the number of model parameters and memory storage of pretrain models; and (3) studying the effect of the proposed feature extraction network (LAN), the number of iteration times and sampling space parameter. |
| Researcher Affiliation | Collaboration | Tao Hu (1), Honggang Qi (1), Jizheng Xu (2), Qingming Huang (1). (1) University of Chinese Academy of Sciences, Beijing, China; (2) Microsoft Research Asia, Beijing, China |
| Pseudocode | Yes | Algorithm 1 Sampling process of SIR |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it include any links to a code repository. |
| Open Datasets | Yes | The 300-W dataset is short for 300 faces in-the-wild (Sagonas et al. 2016), which is designed for evaluating the performance of facial landmarks detection. The training set (3,148 faces in total) consists of AFW dataset (Ramanan 2012), HELEN training set (Le et al. 2012) and LFPW training set (Belhumeur et al. 2011). |
| Dataset Splits | Yes | The training set (3,148 faces in total) consists of AFW dataset (Ramanan 2012), HELEN training set (Le et al. 2012) and LFPW training set (Belhumeur et al. 2011). Two testing sets are established, i.e., public testing set (689 faces in total) including HELEN testing set (Le et al. 2012), LFPW testing set (Belhumeur et al. 2011) and IBUG dataset (Sagonas et al. 2016); and competition testing set (600 faces in total) including 300 indoor and 300 outdoor faces images. |
| Hardware Specification | Yes | We perform the experiments based on a machine with Core i7-5930k CPU, 32 GB memory and GTX 1080 GPU with 8G video memory. |
| Software Dependencies | No | The paper mentions software components like 'Rectified Linear Unit (ReLU)' as an activation function and 'Adadelta (Zeiler 2012)' as an optimizer, but it does not specify version numbers for these or any other software libraries or frameworks used in the implementation. |
| Experiment Setup | Yes | The detected faces are resized into 256 × 256 and the location patch size is 57 × 57. For CNN structure, the Rectified Linear Unit (ReLU) is adopted as the activation function, and the optimizer is the Adadelta (Zeiler 2012) approach, learning rate is set to 0.1 and weight decay is set to 1e-4. |
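
The reported experiment setup can be written down as a small configuration, together with the geometry of a 57 × 57 location patch inside a 256 × 256 face crop. This is only an illustrative sketch, not the authors' code: the names `SetupConfig` and `patch_box` are hypothetical, and shifting a patch back inside the image at the border is an assumption, since the quoted text does not say how borders are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupConfig:
    """Hyperparameters quoted in the Experiment Setup entry (names are hypothetical)."""
    face_size: int = 256         # detected faces are resized to 256 x 256
    patch_size: int = 57         # location patch size is 57 x 57
    activation: str = "relu"     # Rectified Linear Unit
    optimizer: str = "adadelta"  # Adadelta (Zeiler 2012)
    learning_rate: float = 0.1
    weight_decay: float = 1e-4


def patch_box(cx, cy, cfg=SetupConfig()):
    """Return (left, top, right, bottom) of a patch centred on landmark (cx, cy).

    Assumption: a patch that would cross the image border is shifted back
    inside the image; the source does not describe border handling.
    """
    half = cfg.patch_size // 2              # 28 pixels on each side of the centre
    limit = cfg.face_size - cfg.patch_size  # largest valid left/top coordinate
    left = min(max(int(round(cx)) - half, 0), limit)
    top = min(max(int(round(cy)) - half, 0), limit)
    return left, top, left + cfg.patch_size, top + cfg.patch_size
```

For a landmark at the image centre, `patch_box(128, 128)` returns `(100, 100, 157, 157)`; a landmark at the corner `(255, 255)` yields `(199, 199, 256, 256)` under the shifting assumption above.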