Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Say Anything with Any Style
Authors: Shuai Tan, Bin Ji, Yu Ding, Ye Pan
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that our approach surpasses state-of-the-art methods in terms of both lip-synchronization and stylized expression. |
| Researcher Affiliation | Collaboration | Shuai Tan1, Bin Ji1, Yu Ding2, Ye Pan1* 1 Shanghai Jiao Tong University 2 Virtual Human Group, Netease Fuxi AI Lab |
| Pseudocode | No | The paper describes methods and processes but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an explicit statement about releasing its source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | Two public datasets are leveraged to train and test our proposed SAAS: MEAD (Wang et al. 2020) and HDTF (Zhang et al. 2021b). |
| Dataset Splits | No | The paper mentions training and testing but does not provide specific details on validation dataset splits (percentages, sample counts, or explicit methodology). |
| Hardware Specification | Yes | Model training and testing are conducted on 4 NVIDIA GeForce GTX 3090 with 24GB memory. |
| Software Dependencies | No | The paper states 'We implement our SAAS model with Pytorch' and mentions 'Incorporating the Adaptive moment estimation (Adam) optimizer', but it does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | We set w = 8, T = 32, N = 500 and ds = 256. Model training and testing are conducted on 4 NVIDIA GeForce GTX 3090 with 24GB memory. Incorporating the Adaptive moment estimation (Adam) optimizer (Kingma and Ba 2014), the style codebook Cs and Style Encoder Es are pre-trained for 24 hours. Then, we froze weights of Cs and Es, and jointly train the whole network with the learning rate of 2e-4 for 500 and 300 epochs in audio-driven and video-driven settings, respectively. |
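
For readers attempting a reproduction, the values quoted in the Experiment Setup row can be collected into a single configuration object. The following is a minimal sketch, not the authors' code: the class name `ReportedSetup`, its field names, and the `epochs_for` helper are hypothetical, and the symbols `w`, `T`, `N`, and `ds` are copied as quoted without interpreting what they denote. Only the numeric values come from the row above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the Experiment Setup row (names are hypothetical)."""

    # Symbols copied verbatim from the quoted text; their meaning is
    # defined in the paper, not in this summary.
    w: int = 8
    T: int = 32
    N: int = 500
    ds: int = 256

    # Optimization settings as quoted.
    optimizer: str = "Adam"
    learning_rate: float = 2e-4
    epochs_audio_driven: int = 500
    epochs_video_driven: int = 300

    # Pre-training budget for the style codebook and Style Encoder.
    pretrain_hours: int = 24

    # Hardware as quoted: 4 GPUs with 24 GB memory each.
    num_gpus: int = 4
    gpu_memory_gb: int = 24

    def epochs_for(self, setting: str) -> int:
        """Return the joint-training epoch count for a driving setting."""
        if setting == "audio":
            return self.epochs_audio_driven
        if setting == "video":
            return self.epochs_video_driven
        raise ValueError(f"unknown setting: {setting!r}")


setup = ReportedSetup()
```

Items the table marks as unreported (software versions, dataset splits) are deliberately left out of the sketch rather than guessed.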