Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes

Authors: Zhenhui Ye, Tianyun Zhong, Yi Ren, Ziyue Jiang, Jiawei Huang, Rongjie Huang, Jinglin Liu, Jinzheng He, Chen Zhang, Zehan Wang, Xize Cheng, Xiang Yin, Zhou Zhao

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our MimicTalk surpasses previous baselines regarding video quality, efficiency, and expressiveness. |
| Researcher Affiliation | Collaboration | Zhenhui Ye 1,2, Tianyun Zhong 1,2, Yi Ren 2, Ziyue Jiang 1,2, Jiawei Huang 1,2, Rongjie Huang 1, Jinglin Liu 2, Jinzheng He 1, Chen Zhang 2, Zehan Wang 1, Xize Cheng 1, Xiang Yin 2, Zhou Zhao 1 (1 Zhejiang University, 2 ByteDance) |
| Pseudocode | No | The paper describes its methods through network diagrams and mathematical equations, but does not include structured pseudocode or labeled algorithm blocks. |
| Open Source Code | Yes | Source code and video samples are available at https://mimictalk.github.io. |
| Open Datasets | Yes | To train the ICS-A2M model, we use a large-scale lip-reading dataset, VoxCeleb2 (Chung et al., 2018), which consists of about 2,000 hours of video from 6,112 celebrities. |
| Dataset Splits | Yes | For training efficiency, as shown in Fig. 4(a), we adapt the model on a 180-second-long clip as the training data and use the last 10-second clip as the validation set. |
| Hardware Specification | Yes | For the SD-Hybrid adaptation, we trained the model on 1 Nvidia A100 GPU, with a batch size of 1 and 2,000 total iterations, requiring about 8 GB of GPU memory and 0.26 hours. For the ICS-A2M model, we trained on 4 Nvidia A100 GPUs, with a batch size of 20,000 mel frames per GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for the software libraries or frameworks used in its experiments. |
| Experiment Setup | Yes | We set the learning rate to 0.001, λLPIPS = 0.2, and λID = 0.1. Detailed hyper-parameter settings for the model configuration are provided in Table 6. |
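The reported experiment-setup values can be collected into a small configuration sketch. This is a hypothetical illustration only: the `AdaptationConfig` fields mirror numbers quoted above, while the weighted-sum loss form (photometric + λLPIPS·LPIPS + λID·ID) and the loss-term names are assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class AdaptationConfig:
    """Hyper-parameters as reported in the paper's experiment setup."""
    learning_rate: float = 1e-3   # "learning rate to 0.001"
    lambda_lpips: float = 0.2     # λLPIPS
    lambda_id: float = 0.1        # λID
    batch_size: int = 1           # SD-Hybrid adaptation, 1 A100 GPU
    total_iterations: int = 2000


def total_loss(l_photometric: float, l_lpips: float, l_id: float,
               cfg: AdaptationConfig) -> float:
    """Assumed weighted sum: L = L_photo + λLPIPS * L_LPIPS + λID * L_ID."""
    return (l_photometric
            + cfg.lambda_lpips * l_lpips
            + cfg.lambda_id * l_id)


if __name__ == "__main__":
    cfg = AdaptationConfig()
    # Example values for the three terms; result ≈ 1.0 + 0.2*0.5 + 0.1*0.2
    print(total_loss(1.0, 0.5, 0.2, cfg))
```

Treat this as a reading aid for the table row, not as the released training code; the actual loss composition should be checked against the repository at https://mimictalk.github.io.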