ExpCLIP: Bridging Text and Facial Expressions via Semantic Alignment

Authors: Yicheng Zhong, Huawei Wei, Peiji Yang, Zhisheng Wang

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Comprehensive experiments illustrate that our method accomplishes expressive facial animation generation and offers enhanced flexibility in effectively conveying the desired style. |
| Researcher Affiliation | Industry | Tencent Technology (Shenzhen) Co., Ltd. {ajaxzhong, huaweiwei, peijiyang, plorywang}@tencent.com |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not contain any statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | MEAD is a talking-face video corpus featuring 60 actors talking with 8 different emotions at 3 different intensity levels. ... (Wang et al. 2020) ... BEAT comprises 76 hours of speech data, paired with 52D facial blendshape weights. ... (Liu et al. 2022) |
| Dataset Splits | No | For TEAD and MEAD-3D, the paper states, 'We use 90% of the data for training and the remaining 10% for testing'. It does not specify a separate validation split or its size for any of the datasets used. (See the split sketch after the table.) |
| Hardware Specification | Yes | The entire framework is trained using the Adam optimizer (Kingma and Ba 2014) on a single A100 GPU. |
| Software Dependencies | No | The paper mentions 'Our framework is implemented by Pytorch (Paszke et al. 2019)' but does not provide specific version numbers for PyTorch or other key components used (e.g., CLIP-ViT-B/32, the Adam optimizer). |
| Experiment Setup | Yes | ExpCLIP is trained with a learning rate of 1e-5 and a batch size of 256. ... An 8-layer transformer decoder is used... Each training sample has a duration of 64 frames with FPS=15. (See the training-configuration sketch after the table.) |
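
The paper reports only a 90% train / 10% test division for TEAD and MEAD-3D, with no validation split. The sketch below shows one way such a split could be reproduced; the use of `random_split` and the seed are assumptions, since the paper does not say how samples were assigned.

```python
# Minimal sketch of the 90/10 train/test split described for TEAD and MEAD-3D.
# The seed and the use of random_split are assumptions, not details from the paper.
import torch
from torch.utils.data import random_split

def split_90_10(dataset, seed=0):
    n_total = len(dataset)
    n_train = int(0.9 * n_total)   # 90% of the data for training
    n_test = n_total - n_train     # remaining 10% for testing
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [n_train, n_test], generator=generator)
```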
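The training details the paper does state (Adam optimizer, learning rate 1e-5, batch size 256, an 8-layer transformer decoder, 64-frame clips at 15 FPS, single A100 GPU, PyTorch) can be collected into a hedged PyTorch sketch. The model width, number of attention heads, and the dataset object are assumptions not given in the paper.

```python
# Hedged sketch of the reported training configuration: Adam, lr 1e-5,
# batch size 256, an 8-layer transformer decoder, 64-frame samples at 15 FPS.
# D_MODEL, nhead, and the training dataset are assumptions, not reported values.
import torch
from torch import nn
from torch.utils.data import DataLoader

FRAMES_PER_CLIP = 64   # "Each training sample has a duration of 64 frames"
FPS = 15               # "with FPS=15"
D_MODEL = 512          # assumed hidden size (not stated in the paper)

decoder_layer = nn.TransformerDecoderLayer(d_model=D_MODEL, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=8)   # 8-layer transformer decoder

optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-5)    # reported learning rate

# train_dataset is a placeholder for the (unreleased) TEAD / MEAD-3D training data:
# loader = DataLoader(train_dataset, batch_size=256, shuffle=True)
```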