Enhanced Fine-Grained Motion Diffusion for Text-Driven Human Motion Synthesis

Authors: Dong Wei, Xiaoning Sun, Huaijiang Sun, Shengxiang Hu, Bin Li, Weiqing Li, Jianfeng Lu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments: In this section, to evaluate our proposed model, we introduce datasets, evaluation metrics, implementation details, and comparable baselines. Results and visualized comparisons with discussion follow. An ablation study is conducted to show the impact of each component."
Researcher Affiliation | Collaboration | 1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; 2. Tianjin Ai Forward Science and Technology Co., Ltd., Tianjin, China. Emails: {csdwei, sunxiaoning, sunhuaijiang, hushengxiang, li_weiqing, lujf}@njust.edu.cn, libin@aiforward.com
Pseudocode | No | The paper describes its proposed method and network architecture, but it does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured steps in a code-like format.
Open Source Code | No | The paper contains no statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | "KIT Motion-Language dataset (Plappert, Mandery, and Asfour 2016) is a text-to-motion dataset... HumanML3D (Guo et al. 2022) is a new dataset..."
Dataset Splits | No | The paper states "Dataset split procedure is consistent with prior (Ahuja and Morency 2019; Petrovich, Black, and Varol 2022; Kim, Kim, and Choi 2023)" but does not explicitly provide the training, validation, and test splits by percentage, sample count, or predefined split name.
Hardware Specification | Yes | "Our model is trained under the PyTorch framework using an NVIDIA RTX 3090, with batch size 64 for 500K steps on HumanML3D and 200K steps on KIT."
Software Dependencies | No | The paper mentions training under the "PyTorch framework" and using a "frozen CLIP-ViT-B/32 (Radford et al. 2021) model" but does not specify version numbers for PyTorch or any other software dependency, which is required for reproducibility.
Experiment Setup | Yes | "As for the diffusion model, the number of diffusion steps t is 1,000 with cosine beta scheduling following (Tevet et al. 2023; Kim, Kim, and Choi 2023)... During inference, the transition guidance scale r and classifier-free guidance scale s are set to 100.0 and 2.5, respectively. We use the Adam optimizer with the learning rate set to 0.0001. Our model is trained under the PyTorch framework using an NVIDIA RTX 3090, with batch size 64 for 500K steps on HumanML3D and 200K steps on KIT."
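
The quoted setup maps onto standard diffusion components, so for context the sketch below shows how those hyperparameters are conventionally realized in PyTorch. It assumes the standard cosine beta schedule of Nichol and Dhariwal (2021), which the paper cites via (Tevet et al. 2023; Kim, Kim, and Choi 2023), and the usual classifier-free guidance blend; the paper's transition guidance (scale r = 100.0) is method-specific and appears here only as a named constant, not a reconstruction. All function and variable names are illustrative, not taken from the paper.

```python
import math
import torch

# Hyperparameters as quoted in the paper's experiment setup.
DIFFUSION_STEPS = 1_000    # diffusion steps t
CFG_SCALE = 2.5            # classifier-free guidance scale s
TRANSITION_SCALE = 100.0   # transition guidance scale r (method-specific; not reconstructed here)
LEARNING_RATE = 1e-4       # Adam learning rate
BATCH_SIZE = 64

def cosine_beta_schedule(timesteps: int = DIFFUSION_STEPS, s: float = 0.008) -> torch.Tensor:
    """Cosine noise schedule (Nichol and Dhariwal 2021): define the cumulative
    alpha-bar curve with a squared cosine, then back out per-step betas."""
    x = torch.linspace(0, timesteps, timesteps + 1)
    alphas_cumprod = torch.cos((x / timesteps + s) / (1 + s) * math.pi / 2) ** 2
    alphas_cumprod = alphas_cumprod / alphas_cumprod[0]
    betas = 1.0 - alphas_cumprod[1:] / alphas_cumprod[:-1]
    return betas.clamp(max=0.999)  # clip to keep the reverse process well-behaved

def classifier_free_guidance(pred_cond: torch.Tensor,
                             pred_uncond: torch.Tensor,
                             scale: float = CFG_SCALE) -> torch.Tensor:
    """Standard classifier-free guidance: blend the text-conditioned and
    unconditioned denoiser outputs at every sampling step."""
    return pred_uncond + scale * (pred_cond - pred_uncond)
```

Under these assumptions, a training loop would draw batches of 64, sample t uniformly from {1, ..., 1000}, and optimize with torch.optim.Adam(model.parameters(), lr=LEARNING_RATE) for 500K steps on HumanML3D or 200K steps on KIT; how the transition guidance term with scale r enters sampling is defined only in the paper itself.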