Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
FrameBridge: Improving Image-to-Video Generation with Bridge Models
Authors: Yuji Wang, Zehua Chen, Xiaoyu Chen, Yixiang Wei, Jun Zhu, Jianfei Chen
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments conducted on WebVid-2M and UCF-101 demonstrate the superior quality of FrameBridge in comparison with the diffusion counterpart (zero-shot FVD 95 vs. 192 on MSR-VTT and non-zero-shot FVD 122 vs. 171 on UCF-101), and the advantages of our proposed SAF and neural prior for bridge-based I2V models. |
| Researcher Affiliation | Collaboration | 1 Dept. of Comp. Sci. and Tech., Institute for AI, BNRist Center, THBI Lab, Tsinghua-Bosch Joint ML Center, Tsinghua University; 2 Shengshu, Beijing, China. Correspondence to: Jianfei Chen <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Training algorithm for I2V diffusion models. Algorithm 2: Sampling algorithm for FrameBridge. Algorithm 3: Sampling algorithm for I2V diffusion models. Algorithm 4: Training algorithm for FrameBridge. |
| Open Source Code | No | The project page: https://framebridge-icml.github.io/. The text mentions a project page, which is often a high-level overview or demonstration page rather than a direct link to a code repository, and does not explicitly state that code is provided. |
| Open Datasets | Yes | Experiments conducted on WebVid-2M (Bain et al., 2021) and UCF-101 (Soomro, 2012) demonstrate the superior quality of FrameBridge... zero-shot FVD 95 vs. 192 on MSR-VTT (Xu et al., 2016) |
| Dataset Splits | Yes | UCF-101 is an open-sourced video dataset consisting of 13,320 video clips, each categorized into one of 101 action classes. There are three official train-test splits, each of which divides the whole dataset into 9,537 training video clips and 3,783 test video clips. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments. It mentions 'computational resources' but no explicit specifications. |
| Software Dependencies | No | The paper mentions software components such as the AdamW optimizer and BFloat16 (a data type), but does not provide specific version numbers for any ancillary software dependencies (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | We fine-tune the models ε̂_Ψ for 20k or 100k iterations with batch size 64. We use the AdamW optimizer with learning rate 1 × 10⁻⁵ and mixed precision in BFloat16. |
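The quoted setup (AdamW, learning rate 1 × 10⁻⁵, batch size 64, BFloat16 mixed precision) can be sketched in PyTorch as below. This is a minimal illustration only: `DenoiserStub` is a hypothetical placeholder for the paper's denoising network ε̂_Ψ, whose actual architecture is not reproduced here.

```python
import torch
from torch import nn

# Hypothetical stand-in for the paper's denoiser eps_hat_Psi;
# the real model is a video generation network, not a single linear layer.
class DenoiserStub(nn.Module):
    def __init__(self, dim: int = 16):
        super().__init__()
        self.net = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = DenoiserStub()
# AdamW with lr 1e-5, as stated in the Experiment Setup row.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

x = torch.randn(64, 16)       # batch size 64, as in the paper
target = torch.randn(64, 16)

# BFloat16 mixed-precision forward pass (on GPU, device_type="cuda");
# unlike fp16, bf16 autocast needs no gradient scaler.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    loss = nn.functional.mse_loss(model(x), target)

loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In a real run this step would repeat for the reported 20k or 100k iterations over the training data; the loss here is a dummy MSE, not the paper's bridge-model objective.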