Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MotionClone: Training-Free Motion Cloning for Controllable Video Generation
Authors: Pengyang Ling, Jiazi Bu, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Tong Wu, Huaian Chen, Jiaqi Wang, Yi Jin
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that MotionClone exhibits proficiency in both global camera motion and local object motion, with notable superiority in terms of motion fidelity, textual alignment, and temporal consistency. The paper includes sections such as '4 EXPERIMENTS', '4.3 QUALITATIVE COMPARISON', '4.4 QUANTITATIVE COMPARISON', and '4.6 ABLATION AND ANALYSIS' which detail empirical studies and data analysis. |
| Researcher Affiliation | Academia | 1University of Science and Technology of China 2Shanghai Jiao Tong University 3The Chinese University of Hong Kong 4Shanghai AI Laboratory |
| Pseudocode | No | The paper describes methods using mathematical equations and prose but does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | https://github.com/LPengYang/MotionClone |
| Open Datasets | Yes | For experimental evaluation, 40 real videos sourced from DAVIS (Pont-Tuset et al., 2017) and public websites are utilized for a thorough analysis, comprising 15 videos with camera motion and 25 videos with object motion. |
| Dataset Splits | No | The paper mentions using 40 real videos for experimental evaluation but does not provide specific details on training, validation, or test dataset splits. It describes the total number of videos and their categories but not how they were partitioned for experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as CPU, GPU models, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'AnimateDiff (Guo et al., 2023b) as the base text-to-video generation model' and leveraging 'SparseCtrl (Guo et al., 2023a)' for image-to-video and sketch-to-video generation. However, it does not provide specific version numbers for these or any other software dependencies such as programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | For given real videos, a single denoising step at tα = 400 is applied for motion-representation extraction. k = 1 is adopted for the mask in Eq. 5 to facilitate the sparse constraint. A null-text prompt is uniformly used for preparing motion representations, promoting more convenient video customization. Motion guidance is applied on the temporal attention layers in up_blocks.1; detailed ablations of these settings appear in Section 4.6. The guidance weights s and λ in Eq. 2 are empirically set to 7.5 and 2000, respectively. For camera motion cloning, the denoising step count is 100, with motion guidance applied in the first 50 steps. For object motion cloning, the denoising step count is raised to 300, with motion guidance applied in the early 180 steps. |
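The hyperparameters in the Experiment Setup row can be summarized as a small configuration sketch. This is not code from the authors' repository; the field names (`t_alpha`, `mask_top_k`, etc.) are illustrative labels for the values reported in the paper, assuming the camera/object split described above.

```python
from dataclasses import dataclass

@dataclass
class MotionCloneConfig:
    """Hyperparameters as reported in the paper; names are illustrative."""
    t_alpha: int = 400                 # single denoising step for motion-representation extraction
    mask_top_k: int = 1                # k for the sparse mask in Eq. 5
    guidance_weight_s: float = 7.5     # guidance weight s in Eq. 2
    guidance_weight_lambda: float = 2000.0  # guidance weight λ in Eq. 2
    guided_layers: str = "up_blocks.1" # temporal attention layers receiving motion guidance
    denoising_steps: int = 100         # total denoising steps
    motion_guidance_steps: int = 50    # early steps with motion guidance applied

def config_for(motion_type: str) -> MotionCloneConfig:
    """Per-motion-type schedule described in the setup (hypothetical helper)."""
    if motion_type == "camera":
        return MotionCloneConfig(denoising_steps=100, motion_guidance_steps=50)
    if motion_type == "object":
        return MotionCloneConfig(denoising_steps=300, motion_guidance_steps=180)
    raise ValueError(f"unknown motion type: {motion_type}")
```

For example, `config_for("object")` yields the longer 300-step schedule with guidance in the first 180 steps, matching the object-motion setting quoted above.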