Emotion-Controllable Generalized Talking Face Generation

Authors: Sanjana Sinha, Sandika Biswas, Ravindra Yadav, Brojeshwar Bhowmick

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present the quantitative results in Table 2. The emotional talking face SOTA methods MEAD, EVP, [Eskimez et al., 2020; Vougioukas et al., 2019] are dataset-specific and do not generalize well for arbitrary identities outside the training dataset. For a fair comparison, the evaluation metrics of SOTA methods have been reported for the respective dataset on which they were trained. An ablation study of GL is presented in Table 3. An ablation study of GT is presented in Table 4.
Researcher Affiliation | Collaboration | Sanjana Sinha (TCS Research, India), Sandika Biswas (TCS Research, India), Ravindra Yadav (IIT Kanpur, India), Brojeshwar Bhowmick (TCS Research, India)
Pseudocode | No | The paper describes methods in text and uses diagrams (Fig. 2) but does not provide structured pseudocode or algorithm blocks.
Open Source Code | No | The paper references GitHub links for other methods (e.g., 'https://github.com/uniBruce/Mead', 'https://github.com/jixinya/EVP') but does not provide a link or explicit statement about its own source code being available.
Open Datasets | Yes | We use 3 emotional audio-visual datasets MEAD [Wang et al., 2020], CREMA-D [Cao et al., 2014], and RAVDESS [Livingstone and Russo, 2018] for our experiments.
Dataset Splits | No | The paper states that certain values are 'experimentally set using validation data' but does not specify explicit percentages or counts for the training, validation, and test splits.
Hardware Specification | Yes | We train both GL and GT using Pytorch on NVIDIA Quadro P5000 GPUs (16 GB) using Adam Optimizer, with a learning rate of 2e-4.
Software Dependencies | No | The paper mentions 'Pytorch' but does not provide a specific version number or other software dependencies with version numbers.
Experiment Setup | Yes | We train both GL and GT using Pytorch on NVIDIA Quadro P5000 GPUs (16 GB) using Adam Optimizer, with a learning rate of 2e-4. Training of GL takes around a day with batch size 256 (2GB GPU usage), and the training of GT takes around 7 days (batch size 4 on 16GB GPU).
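
The reported setup (Adam optimizer, learning rate 2e-4, batch size 256 for GL and 4 for GT) can be summarized in a minimal PyTorch sketch. Only the optimizer choice, learning rate, and batch sizes come from the paper; the module definitions (landmark_generator, texture_generator), tensor shapes, loss, and dummy data below are hypothetical placeholders, since the authors do not release their code or architecture details here.

```python
# Minimal sketch of the reported training configuration (assumptions noted inline).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

LEARNING_RATE = 2e-4  # reported for both GL and GT

# Hypothetical stand-in modules so the snippet runs end to end;
# the actual GL/GT architectures are not reproduced here.
landmark_generator = nn.Sequential(nn.Linear(128, 68 * 2))          # GL: audio/emotion features -> landmarks
texture_generator = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))    # GT: landmark-conditioned frames

# Adam optimizers with the reported learning rate of 2e-4.
opt_gl = torch.optim.Adam(landmark_generator.parameters(), lr=LEARNING_RATE)
opt_gt = torch.optim.Adam(texture_generator.parameters(), lr=LEARNING_RATE)

# Reported batch sizes: 256 for GL (~2 GB GPU usage), 4 for GT (16 GB GPU).
gl_loader = DataLoader(
    TensorDataset(torch.randn(1024, 128), torch.randn(1024, 68 * 2)),
    batch_size=256, shuffle=True,
)
gt_loader = DataLoader(
    TensorDataset(torch.randn(64, 3, 256, 256), torch.randn(64, 3, 256, 256)),
    batch_size=4, shuffle=True,
)

device = "cuda" if torch.cuda.is_available() else "cpu"
landmark_generator.to(device)

# Illustrative GL training step with a placeholder L1 loss.
for audio_feat, target_lmk in gl_loader:
    audio_feat, target_lmk = audio_feat.to(device), target_lmk.to(device)
    opt_gl.zero_grad()
    loss = nn.functional.l1_loss(landmark_generator(audio_feat), target_lmk)
    loss.backward()
    opt_gl.step()
```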