Semantic Relationships Guided Representation Learning for Facial Action Unit Recognition

Authors: Guanbin Li, Xin Zhu, Yirui Zeng, Qing Wang, Liang Lin. Pages 8594-8601.

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on the two public benchmarks demonstrate that our method outperforms the previous work and achieves state of the art performance.
Researcher Affiliation | Collaboration | 1 School of Data and Computer Science, Sun Yat-sen University, China; 2 Dark Matter AI Inc.
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper mentions 'Pytorch platform (Paszke et al. 2017)' as the basis for training models, but it does not state that the authors' own implementation code is open-source or provide a link to it.
Open Datasets | Yes | We evaluated our model on two spontaneous datasets: BP4D (Zhang et al. 2013) and DISFA (Mavadati et al. 2013). BP4D contains 2D and 3D facial expression data of 41 young adult subjects, including 18 male and 23 female. Each subject participated in 8 tasks, each of which corresponds to a specific expression. There are totally 146,847 face images with labeled AUs. We refer to (Zhao, Chu, and Zhang 2016) and split the dataset into 3 folds, in which we take turns to use two folds for training and the other for testing, and report the average results of multiple tests. DISFA contains stereo videos of 27 subjects with different ethnicity. There are totally 130,815 frame images, each of which is labeled with intensities from 0-5. We choose frames with AU intensities higher or equal to C-level as positive samples, and the rest as negative samples. C is chosen as 2 in our experiment. As with BP4D, we also split the dataset into 3 folds for reliable testing.
Dataset Splits | Yes | We evaluated our model on two spontaneous datasets: BP4D (Zhang et al. 2013) and DISFA (Mavadati et al. 2013)... We refer to (Zhao, Chu, and Zhang 2016) and split the dataset into 3 folds, in which we take turns to use two folds for training and the other for testing, and report the average results of multiple tests... We use an Adam optimizer with learning rate of 0.0001 and mini-batch size 64 with early stopping to train our models.
Hardware Specification | Yes | All models are trained using NVIDIA GeForce GTX TITAN X GPU based on the open-source Pytorch platform (Paszke et al. 2017).
Software Dependencies | Yes | All models are trained using NVIDIA GeForce GTX TITAN X GPU based on the open-source Pytorch platform (Paszke et al. 2017).
Experiment Setup | Yes | During the knowledge-graph construction, we set p_pos as 0.2 and p_neg as 0.03 in Eqs. (8, 9), and set α = 0.002, β = 0.75, k = 2 in Eq. (10). We use an Adam optimizer with learning rate of 0.0001 and mini-batch size 64 with early stopping to train our models. For F1-score, we set the threshold of prediction to 0.5.
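The evaluation protocol quoted above (subject-exclusive 3-fold splits with train-on-two/test-on-one rotation, and DISFA intensities binarized at C-level 2) can be sketched as follows. This is a minimal illustration of the described protocol, not the authors' code; the function names `binarize_disfa` and `three_fold_splits` are hypothetical, and the round-robin fold assignment is an assumption since the paper only says it follows (Zhao, Chu, and Zhang 2016).

```python
def binarize_disfa(intensities, c_level=2):
    """DISFA frames are labeled with AU intensities 0-5; the paper treats
    frames with intensity >= C-level (C = 2) as positive samples."""
    return [1 if v >= c_level else 0 for v in intensities]

def three_fold_splits(subject_ids):
    """Subject-exclusive 3-fold protocol: assign subjects to 3 folds
    (round-robin here, as an assumption), then take turns using two folds
    for training and the remaining one for testing."""
    subjects = sorted(set(subject_ids))
    folds = [subjects[i::3] for i in range(3)]
    splits = []
    for k in range(3):
        test = set(folds[k])
        train = [s for s in subjects if s not in test]
        splits.append((train, sorted(test)))
    return splits
```

Reported numbers are then the average over the three train/test rotations, per the quoted protocol.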
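The F1 evaluation described in the setup (per-AU predictions thresholded at 0.5) can be sketched as below. This is a plain-Python illustration, not the authors' implementation; the helper name `per_au_f1` is hypothetical. The training recipe from the quoted setup (Adam, learning rate 0.0001, mini-batch 64, early stopping) is noted in the comment only.

```python
# Training recipe from the paper (for context): Adam optimizer,
# learning rate 1e-4, mini-batch size 64, early stopping.

def per_au_f1(probs, labels, threshold=0.5):
    """Per-AU F1: probs and labels are N x num_AUs lists of rows.
    Predicted probabilities are binarized at `threshold` (0.5 in the
    paper), then precision/recall/F1 are computed per AU column."""
    num_aus = len(probs[0])
    scores = []
    for j in range(num_aus):
        tp = fp = fn = 0
        for p_row, y_row in zip(probs, labels):
            pred = 1 if p_row[j] >= threshold else 0
            if pred == 1 and y_row[j] == 1:
                tp += 1
            elif pred == 1 and y_row[j] == 0:
                fp += 1
            elif pred == 0 and y_row[j] == 1:
                fn += 1
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return scores
```

Benchmark results would typically also be averaged across AUs, though the quoted text does not state the averaging explicitly.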