FG-EmoTalk: Talking Head Video Generation with Fine-Grained Controllable Facial Expressions

Authors: Zhaoxu Sun, Yuze Xuan, Fang Liu, Yang Xiang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show our method achieves fine-grained expression control, produces high-quality talking head videos and outperforms baseline methods.
Researcher Affiliation | Collaboration | Zhaoxu Sun (1), Yuze Xuan (1), Fang Liu (2)*, Yang Xiang (1); (1) Xiaobing.ai; (2) State Key Laboratory of Media Convergence and Communication, Communication University of China
Pseudocode | No | The paper describes its method in text and with diagrams (Figure 2), but does not provide a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper does not provide an explicit statement about, or a link to, open-source code for the described methodology.
Open Datasets | Yes | We use the HDTF (Zhang et al. 2021b) and CelebV-HQ (Zhu et al. 2022) datasets... Moreover, the MEAD dataset (Wang et al. 2020)... We used the DISFA dataset (Mavadati et al. 2013)...
Dataset Splits | No | The paper mentions selecting 2,000 HDTF videos not in the training set for evaluation and 2,000 MEAD videos for testing, but it does not explicitly specify a validation set, the training/validation/test proportions for all datasets, or the size of the training set.
Hardware Specification | Yes | All experiments were conducted with 4 NVIDIA Tesla A10 GPUs.
Software Dependencies | No | The paper states 'We implemented our framework in Pytorch' but does not provide specific version numbers for PyTorch or for other components such as Wav2Vec2, Gated-GCN, or GFP-GAN.
Experiment Setup | Yes | We used the Adam optimizer with a learning rate of 0.002. The hyperparameters λ_app, λ_exp, and λ_per were set to 100.0, 100.0, and 10.0 respectively in the training stage.
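
To make the quoted training configuration concrete, below is a minimal PyTorch sketch of the reported optimizer and loss weighting. Only the Adam learning rate (0.002) and the weights λ_app = 100.0, λ_exp = 100.0, λ_per = 10.0 come from the paper; the placeholder generator module and the individual loss terms are assumptions, since the authors release no code.

```python
import torch
import torch.nn as nn

# Placeholder generator: the real FG-EmoTalk architecture is not released,
# so a tiny module stands in only to make the optimizer call concrete.
generator = nn.Linear(128, 128)

# Optimizer settings quoted in the paper: Adam with a learning rate of 0.002.
optimizer = torch.optim.Adam(generator.parameters(), lr=0.002)

# Loss weights quoted in the paper.
lambda_app, lambda_exp, lambda_per = 100.0, 100.0, 10.0

def training_loss(l_app: torch.Tensor, l_exp: torch.Tensor, l_per: torch.Tensor) -> torch.Tensor:
    """Weighted sum of the appearance, expression, and perceptual terms.

    The definitions of the individual terms are the paper's and are not
    reproduced here; this function only applies the reported weights.
    """
    return lambda_app * l_app + lambda_exp * l_exp + lambda_per * l_per

# Example usage with dummy scalar losses (illustration only).
dummy_total = training_loss(torch.tensor(0.5), torch.tensor(0.3), torch.tensor(0.2))
print(dummy_total)  # 100.0*0.5 + 100.0*0.3 + 10.0*0.2 = 82.0
```

A faithful reproduction would replace the placeholder module and the dummy inputs with the paper's actual appearance, expression, and perceptual losses.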