MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes

Authors: Zhenhui Ye, Tianyun Zhong, Yi Ren, Ziyue Jiang, Jiawei Huang, Rongjie Huang, Jinglin Liu, Jinzheng He, Chen Zhang, Zehan Wang, Xize Cheng, Xiang Yin, Zhou Zhao

NeurIPS 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our MimicTalk surpasses previous baselines regarding video quality, efficiency, and expressiveness.
Researcher Affiliation | Collaboration | Zhenhui Ye 1,2, Tianyun Zhong 1,2, Yi Ren 2, Ziyue Jiang 1,2, Jiawei Huang 1,2, Rongjie Huang 1, Jinglin Liu 2, Jinzheng He 1, Chen Zhang 2, Zehan Wang 1, Xize Cheng 1, Xiang Yin 2, Zhou Zhao 1; 1 Zhejiang University, 2 ByteDance
Pseudocode | No | The paper describes its methods through network diagrams and mathematical equations, but does not include structured pseudocode or labeled algorithm blocks.
Open Source Code | Yes | Source code and video samples are available at https://mimictalk.github.io.
Open Datasets | Yes | To train the ICS-A2M model, we use a large-scale lip-reading dataset, VoxCeleb2 (Chung et al., 2018), which consists of about 2,000 hours of video from 6,112 celebrities.
Dataset Splits | Yes | For training efficiency, as shown in Fig. 4(a), we adapt the model on a 180-second-long clip as the training data and use the remaining 10-second clip as the validation set.
Hardware Specification | Yes | For the SD-Hybrid adaptation, we trained the model on 1 Nvidia A100 GPU, with a batch size of 1 and 2,000 total iterations, requiring about 8 GB of GPU memory and 0.26 hours. For the ICS-A2M model, we trained on 4 Nvidia A100 GPUs, with a batch size of 20,000 mel frames per GPU.
Software Dependencies | No | The paper does not provide specific version numbers for software libraries or frameworks used in the experiments.
Experiment Setup | Yes | We set the learning rate to 0.001, λ_LPIPS = 0.2, and λ_ID = 0.1. We provide detailed hyper-parameter settings about the model configuration in Table 6. (These values are collected in the sketch below.)
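
Read together, the Dataset Splits, Hardware Specification, and Experiment Setup rows describe a compact per-identity adaptation recipe. The Python sketch below gathers those reported numbers in one place; the config class, its field names, and the total_loss helper are illustrative assumptions rather than the authors' released code, and the loss combination assumes the common pattern of a base reconstruction term plus λ-weighted LPIPS and identity terms.

    # Minimal sketch of the reported adaptation setup. Names such as
    # SDHybridAdaptationConfig and total_loss are hypothetical; the
    # authoritative implementation is the code released at
    # https://mimictalk.github.io.
    from dataclasses import dataclass

    @dataclass
    class SDHybridAdaptationConfig:
        learning_rate: float = 1e-3      # "We set the learning rate to 0.001"
        lambda_lpips: float = 0.2        # weight of the LPIPS perceptual loss
        lambda_id: float = 0.1           # weight of the identity loss
        batch_size: int = 1              # per the hardware note (1x A100, ~8 GB)
        total_iterations: int = 2_000
        train_clip_seconds: int = 180    # 180-second clip used as training data
        val_clip_seconds: int = 10       # held-out 10-second clip for validation

    def total_loss(l_recon, l_lpips, l_id, cfg: SDHybridAdaptationConfig):
        """Weighted sum of the objectives, assuming a base reconstruction
        term plus lambda-weighted LPIPS and identity terms."""
        return l_recon + cfg.lambda_lpips * l_lpips + cfg.lambda_id * l_id

Instantiating SDHybridAdaptationConfig() reproduces the settings quoted above; the paper's Table 6 and the released code remain the authoritative source for the full hyper-parameter configuration.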