Optimal Transport-based Identity Matching for Identity-invariant Facial Expression Recognition
Authors: Daeha Kim, Byung Cheol Song
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive simulations prove that the proposed FER method improves the PCC/CCC performance by up to 10% or more compared to the runner-up on wild datasets. The source code and software demo are available at https://github.com/kdhht2334/ELIM_FER. Contributions of this paper are summarized as follows: (II) The state-of-the-art (SOTA) performance of ELIM is verified through extensive experiments on various real-world datasets. Especially, ELIM works more accurately compared to prior arts even for samples in which inconsistency between facial expressions and the emotion label predictions exists (see Fig. 3). 5 Experiments 5.1 Datasets and Implementation Details 5.2 Idea Verification of ELIM 5.3 Comparison with State-of-the-art Methods 5.4 Ablation Studies and Discussions |
| Researcher Affiliation | Academia | Daeha Kim Inha University kdhht5022@gmail.com Byung Cheol Song Inha University bcsong@inha.ac.kr |
| Pseudocode | Yes | Algorithm 1 Training Procedure of ELIM |
| Open Source Code | Yes | The source code and software demo are available at https://github.com/kdhht2334/ELIM_FER. A link to the full source code is added to the camera-ready version. |
| Open Datasets | Yes | We adopted public databases only for research purposes, and informed consent was obtained if necessary. AFEW-VA [28] derived from the AFEW dataset [10] consists of about 600 short video clips annotated with VA labels frame by frame. Aff-wild [50] is the first large-scale in-the-wild VA dataset that collects the reactions of people who watch movies or TV shows. Aff-wild2, which added videos of spontaneous facial behaviors to Aff-wild, was also released [25]. |
| Dataset Splits | Yes | Evaluation was performed through cross validation at a ratio of 5:1. Since the test data of Aff-wild was not disclosed, this paper adopted the sampled train set for evaluation purposes in the same way as previous works [17, 22]. Table 3: Results on the validation set of Aff-wild2. |
| Hardware Specification | Yes | All models were implemented in PyTorch, and the following experiments were performed on an Intel Xeon CPU and an NVIDIA RTX A6000 GPU. |
| Software Dependencies | No | The paper states 'All models were implemented in PyTorch' but does not specify the version number of PyTorch or any other software dependencies with their versions. |
| Experiment Setup | Yes | The mini-batch size for those models was set to 512, 256, and 128, respectively (cf. Appendix B for model details). Learnable parameters (ϕ, θ) were optimized with Adam optimizer [23]. The learning rate (LR) of ϕ and θ was set to 5e-5 and 1e-5, respectively. LR decreases 0.8-fold at the initial 5K iterations and 0.8-fold every 20K iterations. ϵ of Eq. 4 is 1e-6. At each iteration, the number of ID, i.e., N was set to 10 by default. The size d of z is 64, and the size k of Gumbel-based sampling in Sec. 4.3 is 10. |
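The learning-rate schedule quoted in the setup row ("decreases 0.8-fold at the initial 5K iterations and 0.8-fold every 20K iterations") admits a step-decay reading. Below is a minimal, stdlib-only sketch of one plausible interpretation: a single 0.8x drop at iteration 5,000 plus a further 0.8x drop every 20,000 iterations. The function name `lr_multiplier` and the exact breakpoint semantics are assumptions, not confirmed by the paper; only the 5K/20K breakpoints and the 0.8 factor come from the quoted text.

```python
def lr_multiplier(iteration: int) -> float:
    """Return the cumulative decay factor applied to the base LR
    (e.g. 5e-5 for phi, 1e-5 for theta, per the quoted setup) at a
    given training iteration.

    Assumed reading of the schedule:
      - one 0.8x decay once the initial 5K iterations have passed,
      - an additional 0.8x decay for every full 20K iterations.
    """
    factor = 1.0
    if iteration >= 5_000:      # decay after the initial 5K iterations
        factor *= 0.8
    factor *= 0.8 ** (iteration // 20_000)  # decay every 20K iterations
    return factor


# Illustrative use with the quoted base LR for phi (5e-5):
base_lr_phi = 5e-5
lr_at_25k = base_lr_phi * lr_multiplier(25_000)
```

In PyTorch this behaviour would more idiomatically be expressed with `torch.optim.lr_scheduler.MultiStepLR` (milestones at 5K, 20K, 40K, ...), but the pure-Python form above keeps the assumed arithmetic explicit.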