Say Anything with Any Style
Authors: Shuai Tan, Bin Ji, Yu Ding, Ye Pan
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that our approach surpasses state-of-the-art methods in terms of both lip-synchronization and stylized expression. |
| Researcher Affiliation | Collaboration | Shuai Tan¹, Bin Ji¹, Yu Ding², Ye Pan¹*. ¹Shanghai Jiao Tong University; ²Virtual Human Group, Netease Fuxi AI Lab |
| Pseudocode | No | The paper describes methods and processes but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an explicit statement about releasing its source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | Two public datasets are leveraged to train and test our proposed SAAS: MEAD (Wang et al. 2020) and HDTF (Zhang et al. 2021b). |
| Dataset Splits | No | The paper mentions training and testing but does not provide specific details on validation dataset splits (percentages, sample counts, or explicit methodology). |
| Hardware Specification | Yes | Model training and testing are conducted on 4 NVIDIA GeForce GTX 3090 GPUs with 24GB memory. |
| Software Dependencies | No | The paper states 'We implement our SAAS model with Pytorch' and mentions 'Incorporating the Adaptive moment estimation (Adam) optimizer', but it does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | We set w = 8, T = 32, N = 500 and ds = 256. Model training and testing are conducted on 4 NVIDIA GeForce GTX 3090 GPUs with 24GB memory. Incorporating the Adaptive moment estimation (Adam) optimizer (Kingma and Ba 2014), the style codebook Cs and Style Encoder Es are pre-trained for 24 hours. Then, we freeze the weights of Cs and Es and jointly train the whole network with a learning rate of 2e-4 for 500 and 300 epochs in the audio-driven and video-driven settings, respectively. |
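
The experiment-setup row describes a two-stage PyTorch training schedule: pre-train the style codebook Cs and Style Encoder Es, then freeze them and train the full network with Adam at a learning rate of 2e-4. The sketch below is a minimal illustration of that schedule only; the authors did not release code, so the module names (`style_codebook`, `style_encoder`, `generator`) and their architectures are placeholder assumptions, not the actual SAAS implementation.

```python
import torch
from torch import nn, optim

# Hyperparameters quoted from the paper: w = 8, T = 32, N = 500, ds = 256.
WINDOW_W = 8       # window size (interpretation assumed)
SEQ_LEN_T = 32     # sequence length
CODEBOOK_N = 500   # number of entries in the style codebook Cs
STYLE_DIM = 256    # style code dimension ds

# Placeholder modules standing in for the paper's components.
style_codebook = nn.Embedding(CODEBOOK_N, STYLE_DIM)                  # stands in for Cs
style_encoder = nn.GRU(input_size=STYLE_DIM, hidden_size=STYLE_DIM)   # stands in for Es
generator = nn.Sequential(nn.Linear(STYLE_DIM, STYLE_DIM))            # rest of the network (placeholder)

# Stage 1: pre-train Cs and Es (the paper reports roughly 24 hours of pre-training).
pretrain_params = list(style_codebook.parameters()) + list(style_encoder.parameters())
pretrain_optimizer = optim.Adam(pretrain_params)

# Stage 2: freeze Cs and Es, then jointly train the remaining network
# with Adam at lr = 2e-4 for 500 (audio-driven) or 300 (video-driven) epochs.
for p in pretrain_params:
    p.requires_grad_(False)

joint_optimizer = optim.Adam(generator.parameters(), lr=2e-4)
NUM_EPOCHS = 500  # 500 for the audio-driven setting, 300 for the video-driven setting
```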