Semantic Relationships Guided Representation Learning for Facial Action Unit Recognition
Authors: Guanbin Li, Xin Zhu, Yirui Zeng, Qing Wang, Liang Lin
AAAI 2019, pp. 8594-8601
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the two public benchmarks demonstrate that our method outperforms the previous work and achieves state of the art performance. |
| Researcher Affiliation | Collaboration | 1School of Data and Computer Science, Sun Yat-sen University, China 2Dark Matter AI Inc. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper mentions 'Pytorch platform (Paszke et al. 2017)' as the basis for training models, but it does not state that the authors' own implementation code is open-source or provide a link to it. |
| Open Datasets | Yes | We evaluated our model on two spontaneous dataset: BP4D(Zhang et al. 2013) and DISFA(Mavadati et al. 2013). BP4D contains 2D and 3D facial expression data of 41 young adult subjects, including 18 male and 23 female. Each subject participated in 8 tasks, each of which corresponds to a specific expression. There are totally 146,847 face images with labeled AUs. We refer to (Zhao, Chu, and Zhang 2016) and split the dataset into 3 folds. In which, we take turns to use two folds for training and the other for testing, and report the average results of multiple tests. DISFA contains stereo videos of 27 subjects with different ethnicity. There are totally 130,815 frame images, each of which is labeled with intensities from 0-5. We choose frames with AU intensities higher or equal to C-level as positive samples, and the rest as negative samples. C is chosen as 2 in our experiment. As with BP4D, we also split the dataset into 3 folds for reliable testing. |
| Dataset Splits | Yes | We evaluated our model on two spontaneous dataset: BP4D(Zhang et al. 2013) and DISFA(Mavadati et al. 2013)... We refer to (Zhao, Chu, and Zhang 2016) and split the dataset into 3 folds. In which, we take turns to use two folds for training and the other for testing, and report the average results of multiple tests... We use an Adam optimizer with learning rate of 0.0001 and mini-batch size 64 with early stopping to train our models. |
| Hardware Specification | Yes | All models are trained using NVIDIA GeForce GTX TITAN X GPU based on the open-source Pytorch platform (Paszke et al. 2017). |
| Software Dependencies | Yes | All models are trained using NVIDIA GeForce GTX TITAN X GPU based on the open-source Pytorch platform (Paszke et al. 2017). |
| Experiment Setup | Yes | During the knowledge-graph construction, we set ppos as 0.2 and pneg as 0.03 in Eq (8, 9), and set α = 0.002, β = 0.75, k = 2 in Eq (10). We use an Adam optimizer with learning rate of 0.0001 and mini-batch size 64 with early stopping to train our models. For F1-score, we set the threshold of prediction to 0.5. |
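The DISFA preprocessing described above follows a simple rule: frames whose AU intensity (annotated 0-5) is at or above the chosen C-level (C = 2 in the paper) become positive samples, the rest negative. Since the authors released no code, the sketch below is our own minimal illustration of that rule; the function name is ours, not the paper's.

```python
import numpy as np

def binarize_disfa_intensities(intensities, c_level=2):
    """Convert 0-5 DISFA AU intensity annotations to binary labels.

    Frames with intensity >= c_level are positive (1), the rest
    negative (0), matching the paper's reported choice of C = 2.
    """
    intensities = np.asarray(intensities)
    return (intensities >= c_level).astype(int)

labels = binarize_disfa_intensities([0, 1, 2, 3, 5])
print(labels)  # [0 0 1 1 1]
```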
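The evaluation protocol quoted in the Experiment Setup row thresholds predictions at 0.5 before computing F1-score. The paper does not publish evaluation code, so the following is a hedged sketch of what that metric computation plausibly looks like for a single AU, using the standard F1 definition 2PR/(P+R).

```python
import numpy as np

def predict_au_presence(probs, threshold=0.5):
    """Binarize per-AU probabilities at the paper's reported 0.5 threshold."""
    return (np.asarray(probs) >= threshold).astype(int)

def f1_score(y_true, y_pred):
    """Per-AU F1 from binary labels: 2 * P * R / (P + R)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    if tp == 0:  # avoid division by zero when nothing is predicted correctly
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

preds = predict_au_presence([0.9, 0.6, 0.4, 0.8])
print(f1_score([1, 0, 1, 1], preds))  # 0.666... (P = 2/3, R = 2/3)
```

In multi-AU benchmarks such as BP4D and DISFA, this per-AU score is typically averaged across AUs, though the exact averaging step is not spelled out in the extract above.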
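The Dataset Splits row describes a subject-level 3-fold rotation: two folds train, one tests, and results are averaged over the rotations. A minimal sketch of that rotation, under the assumption that folds are disjoint subject lists (the helper name is ours):

```python
def three_fold_protocol(folds):
    """Rotate over subject folds: two for training, one for testing.

    Yields (train_subjects, test_subjects) pairs, one per rotation,
    mirroring the 3-fold protocol the paper attributes to
    (Zhao, Chu, and Zhang 2016).
    """
    for i in range(len(folds)):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test

folds = [["s1", "s2"], ["s3", "s4"], ["s5", "s6"]]
for train, test in three_fold_protocol(folds):
    print(train, test)
```

Splitting by subject rather than by frame keeps the same person out of both train and test sets, which is the standard way to avoid identity leakage on BP4D and DISFA.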