Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model

Authors: Min Zhao, Hongzhou Zhu, Chendong Xiang, Kaiwen Zheng, Chongxuan LI, Jun Zhu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate these general strategies on various I2V-DMs on our collected open-domain image benchmark and the UCF101 dataset. Extensive results show that our methods outperform baselines by producing higher motion scores with lower errors while maintaining image alignment and temporal consistency, thereby yielding superior overall performance and enabling more accurate motion control."
Researcher Affiliation | Collaboration | Min Zhao (1,3), Hongzhou Zhu (1,3), Chendong Xiang (1,3), Kaiwen Zheng (1,3), Chongxuan Li (2), Jun Zhu (1,3,4). Affiliations: (1) Dept. of Comp. Sci. & Tech., BNRist Center, THU-Bosch ML Center, Tsinghua University; (2) Gaoling School of Artificial Intelligence, Renmin University of China, and Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; (3) ShengShu, Beijing, China; (4) Pazhou Laboratory (Huangpu), Guangzhou, China.
Pseudocode | Yes | "Algorithm 1: Sampling from an I2V diffusion model with Analytic-Init" (a hedged sampling sketch follows the table).
Open Source Code | Yes | Project page: https://cond-image-leak.github.io/. "All used codes in this paper and their licenses are listed in Tab. 3."
Open Datasets | Yes | "We use WebVid-2M [2] as the training dataset... For evaluation, we use UCF101 [49] and our Image Bench dataset."
Dataset Splits | No | The paper names WebVid-2M as the training dataset and UCF101 and Image Bench for evaluation, along with sample counts for FVD and IS on UCF101, but it does not explicitly provide train/validation/test splits (percentages or counts) for its experiments.
Hardware Specification | Yes | "Our experiments were conducted using A800-80G GPUs, and the computational costs are detailed in Tab. 6." Table 6 (compute resources): DynamiCrafter [63]: 20,000 iterations, 8× A800, 8 hours; VideoCrafter1 [12]: 20,000 iterations, 8× A800, 8 hours; SVD [9]: 20,000 iterations, 6× A800, 7 hours.
Software Dependencies | No | The paper lists the existing model implementations it builds on and their licenses in Table 3, but it does not specify programming-language versions (e.g., Python 3.x) or library versions (e.g., PyTorch 1.x) required to replicate the experiments.
Experiment Setup | Yes | Table 4 (training settings for DynamiCrafter [63] and VideoCrafter1 [12]): optimizer AdamW; learning rate 1e-5; weight decay 1e-2; optimizer momentum β1, β2 = 0.9, 0.999; batch size 64; training iterations 20,000. (A hedged training-setup sketch follows the table.)
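
The Pseudocode row cites Algorithm 1, which is not reproduced in this report. The sketch below shows one way such a sampler could look: generation starts at an earlier time `t_start` and is initialized from the condition-image latent plus Gaussian noise rather than from pure noise. The toy schedule `alpha_sigma`, the `init_std_scale` knob (standing in for an analytically derived initial variance), and the placeholder `denoiser` are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of DDIM-style sampling with an "Analytic-Init"-like start.
# Assumptions (not from the paper): the alpha/sigma schedule, the init_std_scale
# knob, and the placeholder denoiser interface.
import torch


def alpha_sigma(t):
    """Toy VP-style schedule on t in [0, 1] (assumption, not the paper's schedule)."""
    return torch.cos(0.5 * torch.pi * t), torch.sin(0.5 * torch.pi * t)


@torch.no_grad()
def sample_with_analytic_init(denoiser, cond_latent, t_start=0.85, num_steps=25,
                              init_std_scale=1.0):
    """Start sampling at t_start < 1 from a noised copy of the condition latent.

    init_std_scale stands in for an analytically chosen initial noise scale; here
    it is just a user-set hyperparameter (illustrative assumption).
    """
    batch = cond_latent.shape[0]
    ts = torch.linspace(t_start, 0.0, num_steps + 1)
    a0, s0 = alpha_sigma(ts[0])
    # Initialize around the scaled condition latent instead of pure Gaussian noise.
    x = a0 * cond_latent + init_std_scale * s0 * torch.randn_like(cond_latent)
    for i in range(num_steps):
        a_cur, s_cur = alpha_sigma(ts[i])
        a_next, s_next = alpha_sigma(ts[i + 1])
        t_batch = torch.full((batch,), ts[i].item())
        eps = denoiser(x, t_batch, cond_latent)      # predicted noise
        x0_pred = (x - s_cur * eps) / a_cur          # predicted clean latent
        x = a_next * x0_pred + s_next * eps          # deterministic DDIM step
    return x


if __name__ == "__main__":
    # Dummy denoiser standing in for the I2V U-Net; shapes: (batch, C, frames, H, W).
    dummy = lambda x, t, c: torch.zeros_like(x)
    cond = torch.randn(1, 4, 16, 32, 32)
    print(sample_with_analytic_init(dummy, cond).shape)
```

Starting below the terminal noise level with a condition-anchored initialization is the piece meant to echo the paper's inference-time strategy against conditional image leakage; the rest is an ordinary deterministic sampling loop.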
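
The Experiment Setup row collects the fine-tuning hyperparameters from Table 4. As a rough illustration of how they could be wired together, the snippet below builds an AdamW optimizer with those values in PyTorch; the `nn.Conv3d` stand-in model, the random batch, and the squared-activation loss are placeholders, and only the optimizer settings, batch size, and iteration count come from the table.

```python
# Hedged sketch: Table 4 hyperparameters in a PyTorch training setup.
# The model, data, and loss are placeholders; only the numeric settings are
# taken from the reported table.
import torch
from torch import nn

model = nn.Conv3d(4, 4, kernel_size=3, padding=1)  # stand-in for the I2V U-Net

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-5,             # learning rate (Table 4)
    betas=(0.9, 0.999),  # optimizer momentum beta1, beta2 (Table 4)
    weight_decay=1e-2,   # weight decay (Table 4)
)

batch_size = 64               # Table 4
training_iterations = 20_000  # Table 4

for step in range(training_iterations):
    # Placeholder batch of video latents: (batch, channels, frames, H, W).
    latents = torch.randn(batch_size, 4, 16, 32, 32)
    loss = model(latents).pow(2).mean()  # dummy objective, not the diffusion loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step == 0:
        break  # demo only; remove to run the full 20,000-iteration schedule
```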