Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation

Authors: Ariel Shaulov, Itay Hazan, Lior Wolf, Hila Chefer

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct qualitative and quantitative experiments to demonstrate FlowMo's effectiveness. Our experiments evaluate the improvement in temporal coherence enabled by our method, as well as its ability to maintain or even enhance other aspects of the generation, such as appearance quality and text alignment.
Researcher Affiliation	Academia	Ariel Shaulov, Itay Hazan, Lior Wolf, Hila Chefer; School of Computer Science, Tel Aviv University, Israel
Pseudocode	Yes	Algorithm 1: A Single FlowMo Denoising Step
Open Source Code	Yes	Our code is submitted as supplemental material, and will be published as an open source repository once the paper is accepted.
Open Datasets	Yes	We employ both the VBench benchmark [16] and human-based evaluations, which serve as the standard evaluation protocols for measuring the quality of text-to-video generation [2, 65, 66, 67, 68]. We conduct a human preference study using the VideoJAM benchmark [3], which was specifically designed to test motion coherence.
Dataset Splits	No	Each prompt was evaluated by five different participants, resulting in 640 unique responses per baseline.
Hardware Specification	Yes	All our experiments employ a learning rate of η = 0.005, using the Adam optimizer, on two NVIDIA H100 GPUs, with 80GB memory each.
Software Dependencies	No	95%-confidence interval was computed using the seaborn python package.
Experiment Setup	Yes	All our experiments employ a learning rate of η = 0.005, using the Adam optimizer, on two NVIDIA H100 GPUs, with 80GB memory each. Wan2.1 is evaluated at a resolution of 480×832, and CogVideoX at 480×720, both generating 81 frames at 16 frames per second, resulting in 5-second videos. Motivated by the insights from Sec. 3.2, we apply FlowMo in the first 12 timesteps of the generation, since these are responsible for coarse motion and structure.
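
For readers who want to record the reported experiment setup in a machine-readable form, the settings quoted above can be sketched as a small Python structure. This is only an illustrative sketch: the `GenerationSetup` class, its field names, and the `SETUPS` list are hypothetical and do not come from the FlowMo code, and reading the two resolution numbers as height then width is an assumption. The numeric values are the ones quoted in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSetup:
    """Hypothetical record of the reported settings (not part of the FlowMo code)."""

    model: str
    height: int  # assumed to be the first number of the reported resolution
    width: int   # assumed to be the second number of the reported resolution
    num_frames: int = 81           # frames generated per video
    fps: int = 16                  # frames per second
    learning_rate: float = 0.005   # reported as eta = 0.005
    optimizer: str = "Adam"
    guided_timesteps: int = 12     # FlowMo is applied in the first 12 timesteps

    @property
    def duration_seconds(self) -> float:
        # Clip length implied by the frame count and the frame rate.
        return self.num_frames / self.fps


SETUPS = [
    GenerationSetup(model="Wan2.1", height=480, width=832),
    GenerationSetup(model="CogVideoX", height=480, width=720),
]

for setup in SETUPS:
    print(
        f"{setup.model}: {setup.height}x{setup.width}, "
        f"{setup.num_frames} frames at {setup.fps} fps "
        f"= {setup.duration_seconds:.2f} s"
    )
```

The computed duration is 81 / 16 = 5.0625 seconds, which matches the "5-second videos" wording in the quoted setup.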