Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
MotionBooth: Motion-Aware Customized Text-to-Video Generation
Authors: Jianzong Wu, Xiangtai Li, Yanhong Zeng, Jiangning Zhang, Qianyu Zhou, Yining Li, Yunhai Tong, Kai Chen
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive quantitative and qualitative evaluations demonstrate the superiority and effectiveness of our method. |
| Researcher Affiliation | Collaboration | Jianzong Wu (1,3), Xiangtai Li (2,3), Yanhong Zeng (3), Jiangning Zhang (4), Qianyu Zhou (5), Yining Li (3), Kai Chen (3), Yunhai Tong (1). Affiliations: (1) PKU; (2) S-Lab, NTU; (3) Shanghai AI Laboratory; (4) ZJU; (5) SJTU |
| Pseudocode | Yes | Pseudo-code of latent shift. To present the latent shift module more clearly, we show the pseudo-code of the algorithm in Fig. 14. |
| Open Source Code | No | We are not able to provide the code at submission time. But we are making sure that our code and models will be released publically in the future. |
| Open Datasets | Yes | For customization, we collect a total of 26 objects from DreamBooth [42] and Custom Diffusion [30]. |
| Dataset Splits | No | The paper describes the datasets used for customization and evaluation, but does not explicitly provide training, validation, and test dataset splits with percentages or sample counts for these datasets. |
| Hardware Specification | Yes | The training process finishes in around 10 minutes in a single NVIDIA A100 80G GPU. |
| Software Dependencies | No | The paper mentions software components like 'AdamW optimizer' and 'DDIM scheduler' but does not specify their version numbers or the versions of other key software dependencies like PyTorch. |
| Experiment Setup | Yes | We train MotionBooth for 300 steps using the AdamW optimizer, with a learning rate of 5e-2 and a weight decay of 1e-2... The loss weight parameters λ1 and λ2 are set to 1.0 and 0.01. We use Zeroscope and LaVie as base models. During inference, we perform 50-step denoising using the DDIM scheduler and set the classifier-free guidance scale to 7.5. The generated videos are 576x320x24 and 512x320x16 for Zeroscope and LaVie, respectively. |
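
The Experiment Setup row can be collected into one configuration record, which may help when attempting to reproduce the reported runs. The sketch below is illustrative only: the class and field names (`TrainConfig`, `InferenceConfig`, `INFERENCE_CONFIGS`) are hypothetical and do not come from the paper or its code, and reading the video sizes as width x height x frames is an assumption. Only the numeric values are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training values quoted in the Experiment Setup row (field names are hypothetical)."""
    steps: int = 300
    optimizer: str = "AdamW"
    learning_rate: float = 5e-2
    weight_decay: float = 1e-2
    lambda_1: float = 1.0   # loss weight λ1
    lambda_2: float = 0.01  # loss weight λ2


@dataclass(frozen=True)
class InferenceConfig:
    """Inference values quoted in the Experiment Setup row (field names are hypothetical)."""
    base_model: str
    width: int        # assumption: first number of the reported size
    height: int       # assumption: second number of the reported size
    num_frames: int   # assumption: third number of the reported size
    scheduler: str = "DDIM"
    denoising_steps: int = 50
    guidance_scale: float = 7.5  # classifier-free guidance scale


# One inference configuration per base model named in the table.
INFERENCE_CONFIGS = {
    "Zeroscope": InferenceConfig("Zeroscope", width=576, height=320, num_frames=24),
    "LaVie": InferenceConfig("LaVie", width=512, height=320, num_frames=16),
}

if __name__ == "__main__":
    print(TrainConfig())
    for name, cfg in INFERENCE_CONFIGS.items():
        print(name, cfg)
```

Software versions are left out on purpose, since the table marks Software Dependencies as unspecified.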