Human Motion Diffusion as a Generative Prior

Authors: Yoni Shafir, Guy Tevet, Roy Kapon, Amit Haim Bermano

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the composition methods using an off-the-shelf motion diffusion model, and further compare the results to dedicated models trained for these specific tasks.
Researcher Affiliation | Academia | Yonatan Shafir, Guy Tevet, Roy Kapon and Amit H. Bermano; Tel Aviv University, Israel; {Shafir2, guytevet}@mail.tau.ac.il
Pseudocode | Yes | Algorithm 1: Fine-tuning method; Algorithm 2: Sampling method
Open Source Code | Yes | Our code and trained models are available at https://github.com/priorMDM/priorMDM.
Open Datasets | Yes | For long sequence generation with our DoubleTake method, we use a fixed MDM (Tevet et al., 2023) trained on the HumanML3D (Guo et al., 2022) dataset, originally trained with up to 10 seconds long motions. To compare with TEACH (Athanasiou et al., 2022), which was dedicatedly trained for this task, we train MDM for 1.25M steps on BABEL (Punnakkal et al., 2021), the same dataset TEACH was trained on... We train and evaluate ComMDM with the CMU-Mocap (CMU) and the 3DPW (Von Marcard et al., 2018) datasets, which contain 55 and 27 two-person motion sequences respectively.
Dataset Splits | No | The paper mentions using 'test sets' for evaluation but does not provide explicit train/validation/test dataset splits (e.g., percentages, sample counts, or specific citations to predefined splits) for reproducibility for all datasets mentioned.
Hardware Specification | Yes | on a single NVIDIA GeForce RTX 2080 Ti GPU.
Software Dependencies | No | The paper mentions various models and frameworks (e.g., CLIP, SMPL, DDPM, MDM) and programming concepts, but it does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | For both datasets, we applied DoubleTake with a one-second-long transition length, T = 700, M_hard = 0.85, M_soft = 0.1 and b = 10.
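
The reported DoubleTake settings can be collected in one small configuration object, which may help when trying to reproduce the setup. The following is a minimal sketch, not code from the authors' repository: the class name, field names, and the `transition_frames` helper are hypothetical, and the frame rate is left as a caller-supplied argument because this summary does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoubleTakeSetup:
    """Hypothetical container for the values quoted in the Experiment Setup row."""

    transition_seconds: float = 1.0  # "one-second-long transition length"
    t: int = 700                     # "T = 700", as reported
    m_hard: float = 0.85             # "M_hard = 0.85"
    m_soft: float = 0.1              # "M_soft = 0.1"
    b: int = 10                      # "b = 10"

    def transition_frames(self, fps: int) -> int:
        # The frame rate is an assumption supplied by the caller;
        # it is not given in this summary.
        return round(self.transition_seconds * fps)


setup = DoubleTakeSetup()
```

Calling `setup.transition_frames(fps)` with the frame rate of the dataset in use converts the one-second transition into a frame count.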