Arbitrary Talking Face Generation via Attentional Audio-Visual Coherence Learning

Authors: Hao Zhu, Huaibo Huang, Yi Li, Aihua Zheng, Ran He

IJCAI 2020

Reproducibility Variable Result LLM Response
Research Type Experimental "Experimental results on benchmark LRW dataset and GRID dataset transcend the state-of-the-art methods on prevalent metrics with robust high-resolution synthesizing on gender and pose variations." (Section 4, Experiments; 4.1, Dataset and Metrics:) "We evaluate our method on prevalent benchmark datasets LRW [Chung and Zisserman, 2016] and GRID [Cooke et al., 2006]. ... We use common reconstruction metrics such as PSNR and SSIM [Wang et al., 2004] to evaluate the quality of the synthesized talking faces. Furthermore, we use Landmark Distance (LMD) to evaluate the accuracy of the generated lip by calculating the landmark distance between the generated video and the original video."
Researcher Affiliation Academia "Hao Zhu1,2, Huaibo Huang2,3, Yi Li2,3, Aihua Zheng1 and Ran He2,3. 1School of Computer Science and Technology, Anhui University, Hefei, China. 2NLPR&CEBSIT&CRIPAC, Institute of Automation, CAS, Beijing, China. 3School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China. haozhu96@gmail.com, {huaibo.huang,yi.li}@cripac.ia.ac.cn, ahzheng214@ahu.edu.cn, rhe@nlpr.ia.ac.cn"
Pseudocode No The paper describes methods and architectures but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code No The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository.
Open Datasets Yes We evaluate our method on prevalent benchmark datasets LRW [Chung and Zisserman, 2016] and GRID [Cooke et al., 2006].
Dataset Splits No The paper does not explicitly provide details about training, validation, and test dataset splits (e.g., percentages, sample counts, or specific predefined split information).
Hardware Specification No The paper does not provide specific details regarding the hardware (e.g., GPU, CPU models, or cloud instances) used for running the experiments.
Software Dependencies No The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup Yes "Our full model is optimized according to the following objective function: L = L_GAN + λ1 L_perc + λ2 L_lip + λ3 L_mi. Specifically, in the training stage, we start from relatively high attention (rate = 0.7–0.9), and progressively decrease it to relatively low attention (rate = 0.1–0.3), then we fix the rate to 1 for the last few epochs."
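The quoted experiment setup (a weighted four-term objective plus a progressive attention-rate schedule) can be sketched as below. This is a hypothetical illustration: the loss-term implementations, λ weights, and the exact decay rule are not specified in the paper excerpt and are placeholders here.

```python
# Sketch of the quoted training setup: a combined objective
# L = L_GAN + λ1*L_perc + λ2*L_lip + λ3*L_mi, and an attention rate
# that decays from a high range (0.7-0.9) to a low range (0.1-0.3),
# then is fixed to 1 for the last few epochs. All numeric choices
# (linear decay, endpoint values, λ values) are assumptions.

def attention_rate(epoch, total_epochs, high=0.8, low=0.2, final_epochs=5):
    """Linearly decay the attention rate from `high` to `low`,
    then fix it to 1.0 for the last `final_epochs` epochs."""
    if epoch >= total_epochs - final_epochs:
        return 1.0
    decay_span = max(total_epochs - final_epochs - 1, 1)
    frac = min(epoch / decay_span, 1.0)
    return high + frac * (low - high)

def total_loss(l_gan, l_perc, l_lip, l_mi, lam1=1.0, lam2=1.0, lam3=1.0):
    """Weighted sum of the four loss terms (λ values are placeholders)."""
    return l_gan + lam1 * l_perc + lam2 * l_lip + lam3 * l_mi
```

For example, with a 100-epoch run the rate starts near 0.8, reaches 0.2 just before the final epochs, and is pinned to 1.0 at the end, matching the schedule described in the quote.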
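For reference, the evaluation metrics named in the report (PSNR and LMD) admit short stand-alone implementations. This is a minimal sketch assuming 8-bit pixel values and precomputed (x, y) landmark coordinates; SSIM is omitted because it requires windowed local statistics.

```python
# Minimal reference implementations of the quoted metrics:
# PSNR (reconstruction quality) and Landmark Distance / LMD
# (lip accuracy via landmark distances between generated and
# ground-truth frames). Inputs are plain Python sequences.
import math

def psnr(frame_a, frame_b, max_val=255.0):
    """Peak signal-to-noise ratio between two flat pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(max_val ** 2 / mse)

def lmd(landmarks_gen, landmarks_ref):
    """Mean Euclidean distance between matched (x, y) landmark pairs."""
    dists = [math.hypot(xg - xr, yg - yr)
             for (xg, yg), (xr, yr) in zip(landmarks_gen, landmarks_ref)]
    return sum(dists) / len(dists)
```

In the paper's protocol, LMD is computed between the generated video and the original video, so lower LMD and higher PSNR indicate better synthesis.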