Say Anything with Any Style

Authors: Shuai Tan, Bin Ji, Yu Ding, Ye Pan

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that our approach surpasses state-of-the-art methods in terms of both lip synchronization and stylized expression.
Researcher Affiliation | Collaboration | Shuai Tan (1), Bin Ji (1), Yu Ding (2), Ye Pan (1, *); (1) Shanghai Jiao Tong University; (2) Virtual Human Group, NetEase Fuxi AI Lab
Pseudocode | No | The paper describes methods and processes but does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an explicit statement about releasing its source code or a link to a code repository for the described methodology.
Open Datasets | Yes | Two public datasets are leveraged to train and test our proposed SAAS: MEAD (Wang et al. 2020) and HDTF (Zhang et al. 2021b).
Dataset Splits | No | The paper mentions training and testing but does not provide specific details on validation splits (percentages, sample counts, or explicit methodology).
Hardware Specification | Yes | Model training and testing are conducted on 4 NVIDIA GeForce RTX 3090 GPUs with 24 GB memory.
Software Dependencies | No | The paper states 'We implement our SAAS model with Pytorch' and mentions 'Incorporating the Adaptive moment estimation (Adam) optimizer', but it does not specify version numbers for any software dependencies.
Experiment Setup | Yes | We set w = 8, T = 32, N = 500 and ds = 256. Model training and testing are conducted on 4 NVIDIA GeForce RTX 3090 GPUs with 24 GB memory. Incorporating the Adaptive moment estimation (Adam) optimizer (Kingma and Ba 2014), the style codebook Cs and Style Encoder Es are pre-trained for 24 hours. Then, we freeze the weights of Cs and Es and jointly train the whole network with a learning rate of 2e-4 for 500 and 300 epochs in the audio-driven and video-driven settings, respectively.
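The Experiment Setup row is the only place the training procedure is summarized: pre-train the style codebook Cs and Style Encoder Es, then freeze them and jointly train the rest of the network with Adam at a learning rate of 2e-4. Below is a minimal PyTorch sketch of that two-stage schedule, using the hyperparameters quoted above. The module choices (a GRU standing in for Es, an embedding table for Cs, a placeholder generator) and the empty loop bodies are assumptions for illustration only; the authors have not released code, and this is not their implementation.

```python
# Hedged sketch of the two-stage schedule described in the Experiment Setup row.
# All module definitions below are hypothetical stand-ins, not the authors' code.
from torch import nn, optim

# Hyperparameters quoted from the paper.
WINDOW, T, N_CODES, D_S = 8, 32, 500, 256   # w, T, N, ds
LR = 2e-4
EPOCHS_AUDIO_DRIVEN, EPOCHS_VIDEO_DRIVEN = 500, 300

# Hypothetical stand-ins for the paper's components.
style_encoder = nn.GRU(input_size=D_S, hidden_size=D_S, batch_first=True)  # "Es"
style_codebook = nn.Embedding(N_CODES, D_S)                                 # "Cs"
generator = nn.Sequential(nn.Linear(D_S, D_S), nn.ReLU(), nn.Linear(D_S, D_S))

# Stage 1: pre-train the style codebook and style encoder (paper: ~24 hours).
stage1_params = list(style_encoder.parameters()) + list(style_codebook.parameters())
stage1_opt = optim.Adam(stage1_params, lr=LR)
# ... pre-training loop over style clips would go here ...

# Stage 2: freeze Cs and Es, then jointly train the remaining network with Adam.
for p in stage1_params:
    p.requires_grad_(False)
stage2_opt = optim.Adam(generator.parameters(), lr=LR)

for epoch in range(EPOCHS_AUDIO_DRIVEN):    # 300 epochs in the video-driven setting
    # ... forward pass, loss computation, and stage2_opt.step() would go here ...
    pass
```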