Generative Human Motion Stylization in Latent Space
Authors: Chuan Guo, Yuxuan Mu, Xinxin Zuo, Peng Dai, Youliang Yan, Juwei Lu, Li Cheng
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our proposed stylization models, despite their lightweight design, outperform the state of the art in style reenactment, content preservation, and generalization across various applications and settings. |
| Researcher Affiliation | Collaboration | University of Alberta; Noah's Ark Lab, Huawei Canada |
| Pseudocode | No | No pseudocode or algorithm block was found in the paper. |
| Open Source Code | Yes | Code and Model. The code of our approach and implemented baselines are also submitted for reference. Code and trained model will be publicly available upon acceptance. |
| Open Datasets | Yes | We adopt three datasets for comprehensive evaluation. (Aberman et al., 2020) is a widely used motion style dataset that contains 16 distinct style labels, including angry, happy, old, etc., with a total duration of 193 minutes. (Xia et al., 2015) is a much smaller motion style collection (25 minutes) captured in 8 styles, with accurate action-type annotations (8 actions). The third is CMU Mocap (CMU), an unlabeled dataset with highly diverse and abundant motion data. All motion data is retargeted to the same 21-joint skeleton structure, with a 10% held-out subset for evaluation (a split sketch follows the table). |
| Dataset Splits | No | All motion data is retargeted to the same 21-joint skeleton structure, with a 10% held-out subset for evaluation. |
| Hardware Specification | Yes | Table 5 presents comparisons of the average time cost for a single forward pass with 160-frame motion inputs, evaluated on a single Tesla P100 16G GPU (a timing sketch follows the table). |
| Software Dependencies | No | Our models are implemented in PyTorch. |
| Experiment Setup | Yes | The values of λ_kld, λ_l1, and λ_sms are all set to 0.001, and the dimension D_z of z is 512. When training our latent stylization network, the values of (λ_hsa, λ_cyc, λ_kl) are (1, 0.1, 0.1) in the supervised setting and (0.1, 1, 0.01) in the unsupervised setting (a configuration sketch follows the table). |
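
The paper states only that a 10% subset of the retargeted motion data is held out for evaluation, without describing the sampling mechanism. A minimal sketch of such a split, assuming clips are held in a Python list and that a fixed seed is used (both are assumptions, not details from the paper):

```python
import random

def held_out_split(clips, held_out_frac=0.1, seed=0):
    """Hold out a fraction of motion clips for evaluation.

    Hypothetical sketch: the paper says only that a 10% subset is held
    out, not how clips are sampled or whether a seed is fixed.
    """
    indices = list(range(len(clips)))
    random.Random(seed).shuffle(indices)
    n_eval = max(1, int(len(indices) * held_out_frac))
    eval_ids = set(indices[:n_eval])
    train = [clips[i] for i in indices if i not in eval_ids]
    heldout = [clips[i] for i in indices if i in eval_ids]
    return train, heldout
```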
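
The Table 5 timings are average single-forward-pass costs for 160-frame inputs on a Tesla P100 16G. One plausible way to take such a measurement in PyTorch is sketched below; the warm-up count, number of runs, and the `model`/`example` placeholders are assumptions, since the paper does not describe its measurement protocol:

```python
import time
import torch

@torch.no_grad()
def average_forward_ms(model, example, warmup=10, runs=100):
    """Average GPU forward-pass time in milliseconds (sketch only)."""
    model.eval()
    for _ in range(warmup):       # warm up CUDA kernels and caches
        model(example)
    torch.cuda.synchronize()      # make sure queued kernels finish
    start = time.perf_counter()
    for _ in range(runs):
        model(example)
    torch.cuda.synchronize()      # wait for the timed kernels, too
    return (time.perf_counter() - start) / runs * 1e3
```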
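
The Experiment Setup row maps directly onto a small hyperparameter configuration. The sketch below restates those reported values; the identifier names (`lambda_kld`, `d_z`, etc.) are hypothetical stand-ins for the paper's λ symbols, not names from the released code:

```python
# Reported autoencoder hyperparameters; identifier names are hypothetical
# stand-ins for the paper's lambda symbols.
autoencoder_cfg = {
    "lambda_kld": 1e-3,  # KL-divergence weight
    "lambda_l1": 1e-3,   # L1 weight
    "lambda_sms": 1e-3,  # SMS-term weight (acronym as used in the paper)
    "d_z": 512,          # dimension D_z of the latent code z
}

# Loss weights (lambda_hsa, lambda_cyc, lambda_kl) for the latent
# stylization network, per training regime as reported in the paper.
stylization_cfg = {
    "supervised": {"lambda_hsa": 1.0, "lambda_cyc": 0.1, "lambda_kl": 0.1},
    "unsupervised": {"lambda_hsa": 0.1, "lambda_cyc": 1.0, "lambda_kl": 0.01},
}
```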