Enhanced Fine-Grained Motion Diffusion for Text-Driven Human Motion Synthesis
Authors: Dong Wei, Xiaoning Sun, Huaijiang Sun, Shengxiang Hu, Bin Li, Weiqing Li, Jianfeng Lu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments: "In this section, to evaluate our proposed model, we introduce datasets, evaluation metrics, implementation details, and comparable baselines. Results and visualized comparisons with discussion follow. An ablation study is conducted to show the impact of each component." |
| Researcher Affiliation | Collaboration | (1) School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; (2) Tianjin Ai Forward Science and Technology Co., Ltd., Tianjin, China. {csdwei, sunxiaoning, sunhuaijiang, hushengxiang, li weiqing, lujf}@njust.edu.cn, libin@aiforward.com |
| Pseudocode | No | The paper describes its proposed method and network architecture, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. |
| Open Source Code | No | The paper does not contain any statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | KIT Motion-Language dataset (Plappert, Mandery, and Asfour 2016) is a text-to-motion dataset... HumanML3D (Guo et al. 2022) is a new dataset... |
| Dataset Splits | No | The paper states 'Dataset split procedure is consistent with prior (Ahuja and Morency 2019; Petrovich, Black, and Varol 2022; Kim, Kim, and Choi 2023)' but does not explicitly provide the training, validation, and test dataset splits by percentage, sample counts, or specific predefined split names within the paper. |
| Hardware Specification | Yes | Our model is trained under the PyTorch framework using an NVIDIA RTX 3090, with batch size 64 for 500K steps on HumanML3D and 200K steps on KIT. |
| Software Dependencies | No | The paper mentions training under the 'PyTorch framework' and using a 'frozen CLIP-ViT-B/32 (Radford et al. 2021) model' but does not specify version numbers for PyTorch or any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | As for the diffusion model, the number of diffusion steps t is 1,000 with cosine beta scheduling following (Tevet et al. 2023; Kim, Kim, and Choi 2023)... During inference, the transition guidance scale r and classifier-free guidance scale s are set to 100.0 and 2.5, respectively. We use the Adam optimizer with learning rate set to 0.0001. Our model is trained under the PyTorch framework using an NVIDIA RTX 3090, with batch size 64 for 500K steps on HumanML3D and 200K steps on KIT. |
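The experiment-setup row reports a 1,000-step diffusion process with cosine beta scheduling and a classifier-free guidance scale of 2.5. A minimal sketch of those two pieces is below; the function names and the pure-Python implementation are illustrative assumptions (following the standard cosine schedule of Nichol and Dhariwal 2021 that the cited baselines use), not the authors' code:

```python
import math

def cosine_beta_schedule(timesteps: int, s: float = 0.008) -> list[float]:
    """Cosine noise schedule: alpha_bar(t) = cos^2(((t/T + s) / (1 + s)) * pi/2)."""
    alpha_bar = [
        math.cos(((t / timesteps + s) / (1 + s)) * math.pi / 2) ** 2
        for t in range(timesteps + 1)
    ]
    # beta_t = 1 - alpha_bar(t) / alpha_bar(t-1), clipped for numerical stability.
    return [
        min(1.0 - alpha_bar[t + 1] / alpha_bar[t], 0.999)
        for t in range(timesteps)
    ]

def classifier_free_guidance(eps_uncond, eps_cond, scale: float = 2.5):
    """Blend unconditional and text-conditioned noise predictions at scale s."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# The paper's reported settings: t = 1,000 diffusion steps, s = 2.5.
betas = cosine_beta_schedule(1000)
```

Note that the betas grow monotonically toward the clipping value of 0.999, so later timesteps add noise much faster than early ones; the transition guidance scale r = 100.0 is specific to the paper's method and is not sketched here.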