Human Motion Diffusion as a Generative Prior
Authors: Yoni Shafir, Guy Tevet, Roy Kapon, Amit Haim Bermano
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the composition methods using an off-the-shelf motion diffusion model, and further compare the results to dedicated models trained for these specific tasks. |
| Researcher Affiliation | Academia | Yonatan Shafir, Guy Tevet, Roy Kapon and Amit H. Bermano, Tel Aviv University, Israel, {Shafir2, guytevet}@mail.tau.ac.il |
| Pseudocode | Yes | Algorithm 1 Fine-tuning method; Algorithm 2 Sampling method |
| Open Source Code | Yes | Our code and trained models are available at https://github.com/priorMDM/priorMDM. |
| Open Datasets | Yes | For long sequence generation with our DoubleTake method, we use a fixed MDM (Tevet et al., 2023) trained on the HumanML3D (Guo et al., 2022) dataset, originally trained with up to 10 seconds long motions. To compare with TEACH (Athanasiou et al., 2022), which was dedicatedly trained for this task, we train MDM for 1.25M steps on BABEL (Punnakkal et al., 2021), the same dataset TEACH was trained on... We train and evaluate ComMDM with the CMU-Mocap (CMU) and the 3DPW (Von Marcard et al., 2018) datasets, which contain 55 and 27 two-person motion sequences respectively. |
| Dataset Splits | No | The paper mentions using 'test sets' for evaluation but does not provide explicit train/validation/test dataset splits (e.g., percentages, sample counts, or specific citations to predefined splits) for reproducibility for all datasets mentioned. |
| Hardware Specification | Yes | on a single NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions various models and frameworks (e.g., CLIP, SMPL, DDPM, MDM) and programming concepts, but it does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | For both datasets, we applied DoubleTake with a one-second-long transition length, T = 700, M_hard = 0.85, M_soft = 0.1 and b = 10. |