Talking Face Generation by Adversarially Disentangled Audio-Visual Representation

Authors: Hang Zhou, Yu Liu, Ziwei Liu, Ping Luo, Xiaogang Wang

AAAI 2019, pp. 9299-9306

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that the proposed approach generates realistic talking face sequences on arbitrary subjects with much clearer lip motion patterns than previous work.
Researcher Affiliation | Academia | The Chinese University of Hong Kong, Hong Kong, China {zhouhang@link, yuliu@ee, zwliu@ie, xgwang@ee}.cuhk.edu.hk, pluo.lhi@gmail.com
Pseudocode | No | The paper describes the model architecture and training process in detail with figures and formulas, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper links to a project page 'https://liuziwei7.github.io/projects/Talking Face' but this page does not explicitly provide access to the source code for the described methodology.
Open Datasets | Yes | Our model is trained and evaluated on the LRW dataset (Chung and Zisserman 2016a)... the identity-preserving module of the network is trained on a subset of the MS-Celeb-1M dataset (Guo et al. 2016).
Dataset Splits | Yes | For each class, there are more than 800 training samples and 50 validation/test samples.
Hardware Specification | Yes | The batch size is set to be 18 with 1e-4 learning rate and trained on 6 Titan X GPUs.
Software Dependencies | No | The paper states 'We implemented DAVS using Pytorch.' but does not specify the version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | The batch size is set to be 18 with 1e-4 learning rate and trained on 6 Titan X GPUs. It takes about 4 epochs for the audio-visual speech recognition and person-identity recognition to converge and another 5 epochs for further tuning the generator.
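As a rough illustration of the reported setup, the sketch below wires the quoted hyperparameters (batch size 18, learning rate 1e-4, 6 GPUs, 4 epochs of recognition training plus 5 epochs of generator tuning) into a generic PyTorch training loop. The optimizer choice (Adam), the model and dataset objects, and the loss interface are hypothetical placeholders; the paper's text quoted above does not specify them.

```python
# Minimal sketch of the reported training configuration.
# Only batch size, learning rate, GPU count, and epoch counts come from the
# paper's quoted setup; everything else is a hypothetical placeholder.
import torch
from torch.utils.data import DataLoader

def train_stage(model, dataset, num_epochs):
    # The paper reports training on 6 Titan X GPUs.
    model = torch.nn.DataParallel(model.cuda(), device_ids=list(range(6)))
    loader = DataLoader(dataset, batch_size=18, shuffle=True, num_workers=4)
    # Adam is an assumption; the paper only states a 1e-4 learning rate.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    for epoch in range(num_epochs):
        for audio, frames, labels in loader:  # placeholder batch layout
            optimizer.zero_grad()
            # Placeholder interface: the model is assumed to return its loss.
            loss = model(audio.cuda(), frames.cuda(), labels.cuda())
            loss.mean().backward()
            optimizer.step()

# Stage 1: ~4 epochs for audio-visual speech recognition and
# person-identity recognition to converge.
# train_stage(davs_model, lrw_dataset, num_epochs=4)
# Stage 2: ~5 more epochs for further tuning the generator.
# train_stage(davs_model, lrw_dataset, num_epochs=5)
```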