Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation

Authors: Chenhui Zhu, Yilu Wu, Shuai Wang, Gangshan Wu, Limin Wang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 Experiments; 4.1 Implementation Details; 4.2 Datasets and Metrics; 4.3 Results on General Domain; 4.4 Zero-Shot Transfer to Specialized Domains; 4.5 Ablation Studies
Researcher Affiliation | Academia | (1) State Key Laboratory for Novel Software Technology, Nanjing University; (2) Shanghai AI Laboratory
Pseudocode | No | The paper describes its methodology in detail through textual explanations and mathematical formulations, but it does not include any explicitly labeled pseudocode blocks or algorithms.
Open Source Code | No | Answer: [No] Justification: The code and weights will be open-sourced when this paper accepted.
Open Datasets | Yes | We construct three video retrieval databases using the GTE-v1.5 model [39] to encode video captions into embedding vectors... Our databases include: (1) OpenVid-1M [38]: A large-scale, general-domain video dataset. ... (2) SkillVid [40]: A specialized dataset containing instructional and skill-based videos. ... (3) InternVid-10M [41]: A massive-scale video-text dataset originally curated for video understanding tasks.
Dataset Splits | Yes | We evaluate our method on two datasets: (1) OpenVid-1K, a diverse test set of 1,000 videos sampled from OpenVid-1M [38] with no overlap with the training data, representing general video domains; and (2) SkillVid [40] test set, the test set of SkillVid that we use to assess zero-shot capabilities.
Hardware Specification | Yes | Hardware: 8 NVIDIA RTX A6000 GPUs
Software Dependencies | No | The paper mentions using specific models like Stable Video Diffusion (SVD) [1], DynamiCrafter [2], CogVideoX-5b [5], VideoMAE-Base [30], DINOv2-Large [32], and GTE-base-1.5-en [39], as well as the AdamW optimizer. However, it does not provide version numbers for any programming languages, libraries, or core software components.
Experiment Setup | Yes | Table 8: Training hyperparameters for the two-stage approach across different models. Table 9: Generation hyperparameters for OpenVid-1K and SkillVid dataset.
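
The Open Datasets entry above says the retrieval databases are built by encoding video captions into embedding vectors. The following is a minimal sketch of caption-based retrieval consistent with that description. It is not the authors' code: `toy_embed` is a hypothetical hashed bag-of-words stand-in for the GTE-v1.5 encoder, and the captions, video ids, and function names are illustrative assumptions.

```python
import hashlib
import math


def toy_embed(text, dim=64):
    """Hypothetical stand-in for a caption encoder such as GTE-v1.5.

    Hashes each whitespace token into one of `dim` buckets and
    L2-normalises the resulting count vector.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def build_index(captions):
    """Encode every caption once; returns a list of (video_id, embedding)."""
    return [(video_id, toy_embed(caption)) for video_id, caption in captions.items()]


def retrieve(index, query, k=2):
    """Return the ids of the k videos whose captions are most similar to the query.

    Embeddings are unit-length, so the dot product equals cosine similarity.
    """
    q = toy_embed(query)
    scored = [(sum(a * b for a, b in zip(q, emb)), video_id) for video_id, emb in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [video_id for _, video_id in scored[:k]]


# Illustrative captions only; they are not taken from any of the datasets above.
captions = {
    "v1": "a person chopping vegetables in a kitchen",
    "v2": "a dog running on the beach",
    "v3": "waves crashing on rocks",
}
index = build_index(captions)
top = retrieve(index, "a dog running on the beach", k=2)
```

In practice the toy encoder would be replaced by a real text-embedding model, and the linear scan by an approximate nearest-neighbour index once the database reaches the scale of the datasets listed above.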