Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
HuTuMotion: Human-Tuned Navigation of Latent Motion Diffusion Models with Minimal Feedback
Authors: Gaoge Han, Shaoli Huang, Mingming Gong, Jinglei Tang
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show the significantly superior performance of our method over existing state-of-the-art approaches. In our quantitative experiments on both the HumanML3D and KIT datasets, HuTuMotion significantly outperforms existing state-of-the-art methods. Additionally, through qualitative experiments, we observe that our method generates more natural and semantically correct motions. |
| Researcher Affiliation | Collaboration | 1College of Information Engineering, Northwest A&F University 2Tencent AI Lab 3School of Mathematics and Statistics, The University of Melbourne 4Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates |
| Pseudocode | Yes | Algorithm 1: Distribution optimization for representative texts |
| Open Source Code | No | The paper does not contain an explicit statement that the source code for the methodology is open-source or provides a direct link to a code repository. |
| Open Datasets | Yes | We experiment with two text-to-motion synthesis datasets: HumanML3D (Guo et al. 2022b) and KIT (Plappert, Mandery, and Asfour 2016). |
| Dataset Splits | Yes | The dataset, downsampled to 12.5 FPS, is partitioned into 80% training, 5% validation, and 15% test sets. |
| Hardware Specification | Yes | Our representative distribution optimization and semantically guided generation are conducted on a single NVIDIA GeForce RTX 2080 Ti GPU |
| Software Dependencies | No | The paper mentions using "MLD (Chen et al. 2023)" and "DDIM (Song, Meng, and Ermon 2020) as the sampler," but it does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Our representative distribution optimization and semantically guided generation are conducted on a single NVIDIA GeForce RTX 2080 Ti GPU, with text embedding and latent dimensions set to 768 and 256, respectively. We set σ to 0.2 for latent sampling and use DDIM (Song, Meng, and Ermon 2020) as the denoising motion diffusion sampler. All other settings are consistent with MLD (Chen et al. 2023). |
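The 80/5/15 dataset split reported above can be sketched as a simple partitioning routine. This is an illustrative sketch only: the proportions come from the paper, but the shuffling strategy, seed, and function name `split_dataset` are assumptions, not the authors' preprocessing code.

```python
import random

def split_dataset(items, train=0.80, val=0.05, test=0.15, seed=0):
    """Partition samples into train/val/test subsets (80/5/15 by default).

    Hypothetical helper illustrating the split reported in the paper;
    the shuffle-then-slice strategy is an assumption for illustration.
    """
    assert abs(train + val + test - 1.0) < 1e-9, "fractions must sum to 1"
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n = len(items)
    n_train = int(n * train)
    n_val = int(n * val)
    return (items[:n_train],                       # training set
            items[n_train:n_train + n_val],        # validation set
            items[n_train + n_val:])               # test set
```

For a dataset of 100 motion clips this yields subsets of 80, 5, and 15 clips respectively.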
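The setup row mentions σ = 0.2 for latent sampling with a 256-dimensional latent space. A minimal sketch of what such sampling could look like, assuming an isotropic Gaussian around a mean latent (the Gaussian form and the helper name `sample_latent` are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def sample_latent(mu, sigma=0.2, seed=None):
    """Draw a latent vector from an isotropic Gaussian centered at `mu`.

    Illustrative assumption: the paper's sigma = 0.2 is interpreted here
    as the standard deviation of additive Gaussian noise in latent space.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=np.float64)
    # Perturb the mean latent with scaled standard-normal noise.
    return mu + sigma * rng.standard_normal(mu.shape)
```

With a 256-dimensional zero mean, each call returns a 256-dimensional latent whose coordinates have standard deviation roughly 0.2.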