FG-EmoTalk: Talking Head Video Generation with Fine-Grained Controllable Facial Expressions
Authors: Zhaoxu Sun, Yuze Xuan, Fang Liu, Yang Xiang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show our method achieves fine-grained expression control, produces high-quality talking head videos, and outperforms baseline methods. |
| Researcher Affiliation | Collaboration | Zhaoxu Sun (1), Yuze Xuan (1), Fang Liu (2, *), Yang Xiang (1). Affiliations: (1) Xiaobing.ai; (2) State Key Laboratory of Media Convergence and Communication, Communication University of China. |
| Pseudocode | No | The paper describes its method in text and with diagrams (Figure 2), but does not provide a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code for the methodology described. |
| Open Datasets | Yes | We use the HDTF (Zhang et al. 2021b) and CelebV-HQ (Zhu et al. 2022) datasets... Moreover, the MEAD dataset (Wang et al. 2020)... We used the DISFA dataset (Mavadati et al. 2013)... |
| Dataset Splits | No | The paper mentions selecting 2,000 videos from HDTF that are not in the training set for evaluation, and 2,000 videos from MEAD for testing. It does not specify a validation set, explicit train/validation/test percentages for any dataset, or the size of the training set. |
| Hardware Specification | Yes | All experiments were conducted with 4 NVIDIA Tesla A10 GPUs. |
| Software Dependencies | No | The paper states 'We implemented our framework in Pytorch' but gives no version numbers for PyTorch or for any other software dependency it relies on, such as Wav2Vec2, Gated-GCN, or GFP-GAN (see the version-logging sketch after the table). |
| Experiment Setup | Yes | We used the Adam optimizer with a learning rate of 0.002. The hyperparameters λ_app, λ_exp, and λ_per were set to 100.0, 100.0, and 10.0 respectively in the training stage (see the training-configuration sketch after the table). |
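
Because the paper pins no dependency versions, a reproduction attempt would need to record its own environment. Below is a minimal sketch of how that could be logged; the package names are assumptions (the paper names PyTorch, Wav2Vec2, and GFP-GAN as components, but not the distributions they were installed from).

```python
# Minimal sketch: log the versions of dependencies the paper leaves
# unpinned. Package names are assumptions; adjust to the actual
# environment used in a reproduction attempt.
import importlib.metadata as md

for pkg in ("torch", "transformers", "gfpgan"):
    try:
        print(f"{pkg}=={md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")
```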
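
The reported optimizer and loss weights translate directly into a training configuration. A minimal PyTorch sketch follows: only the choice of Adam, the 0.002 learning rate, and the three λ values come from the paper; the model and the individual loss terms are hypothetical placeholders.

```python
import torch

# Loss weights reported in the paper (training stage).
LAMBDA_APP = 100.0  # λ_app: appearance loss weight
LAMBDA_EXP = 100.0  # λ_exp: expression loss weight
LAMBDA_PER = 10.0   # λ_per: perceptual loss weight

def total_loss(l_app: torch.Tensor,
               l_exp: torch.Tensor,
               l_per: torch.Tensor) -> torch.Tensor:
    """Weighted sum of the three loss terms named in the paper.

    The individual terms are computed elsewhere; this function only
    applies the reported weights.
    """
    return LAMBDA_APP * l_app + LAMBDA_EXP * l_exp + LAMBDA_PER * l_per

model = torch.nn.Linear(8, 8)  # hypothetical stand-in for the generator
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)  # reported setting
```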