Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

Authors: Hancheng Ye, Jiakang Yuan, Renqiu Xia, Xiangchao Yan, Tao Chen, Junchi Yan, Botian Shi, Bo Zhang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on image and video diffusion models demonstrate that our method can significantly speed up the denoising process while generating identical results to the original process, achieving up to an average 2.5× speedup without quality degradation.
Researcher Affiliation | Collaboration | (1) Shanghai Artificial Intelligence Laboratory; (2) School of Information Science and Technology, Fudan University; (3) School of Artificial Intelligence, Shanghai Jiao Tong University
Pseudocode | Yes | Algorithm 1: Greedy Search for the Optimal Skipping Path.
Open Source Code | Yes | The code is available at https://github.com/UniModal4Reasoning/AdaptiveDiffusion.
Open Datasets | Yes | Benchmark Datasets. Following [27], we use ImageNet [7] and MS-COCO 2017 [21] to evaluate the results on class-conditional image generation and T2I tasks, respectively. For the I2V task, we randomly sample 100 prompts and reference images from AIGCBench [8]. For the T2V task, we conduct experiments on the widely used benchmark MSR-VTT [45] and sample one caption for each video in the validation set as the test prompt.
Dataset Splits | No | Although the paper uses the MSR-VTT "validation set" as the source of test prompts, it does not give explicit training/validation/test splits (e.g., percentages or sample counts) for any of the datasets used (ImageNet, MS-COCO, AIGCBench, MSR-VTT); it relies on implied standard splits for some benchmarks.
Hardware Specification | Yes | We conduct all experiments on RTX 3090 GPUs.
Software Dependencies | No | The paper names several software components (TorchMetrics, Clean-FID, DPM-Solver, the LDM codebase, the DeepCache codebase, and Diffusers) but specifies no version numbers for any of them, which a reproducible description requires.
Experiment Setup | Yes | For the SD-1-5 and SDXL models, the original sampling timesteps T are set to 50, and the two hyperparameters are set as δ = 0.01, Cmax = 4. For LDM-4, T = 250, δ = 0.005, Cmax = 10. For I2VGen-XL and ModelScope T2V, T = 50, δ = 0.007, Cmax = 4.
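The per-model hyperparameters reported above (sampling timesteps T, difference threshold δ, and maximum consecutive skips Cmax) can be collected into a small configuration sketch. This is a hedged illustration, not the authors' code: the dictionary keys, the function name `may_skip_step`, and the gating rule it implements are assumptions that only mirror the paper's described idea of skipping a denoising step when the bounded latent-difference estimate stays below δ, capped at Cmax consecutive skips.

```python
# Reported settings per model (values from the paper; key names are ours).
ADAPTIVE_DIFFUSION_SETTINGS = {
    "SD-1-5":         {"timesteps": 50,  "delta": 0.01,  "c_max": 4},
    "SDXL":           {"timesteps": 50,  "delta": 0.01,  "c_max": 4},
    "LDM-4":          {"timesteps": 250, "delta": 0.005, "c_max": 10},
    "I2VGen-XL":      {"timesteps": 50,  "delta": 0.007, "c_max": 4},
    "ModelScope T2V": {"timesteps": 50,  "delta": 0.007, "c_max": 4},
}


def may_skip_step(latent_diff_norm: float,
                  delta: float,
                  consecutive_skips: int,
                  c_max: int) -> bool:
    """Hypothetical gating rule: reuse the cached noise prediction only when
    the bounded-difference estimate is below delta AND fewer than c_max
    consecutive steps have already been skipped."""
    return latent_diff_norm < delta and consecutive_skips < c_max
```

For example, under the SD-1-5 setting (δ = 0.01, Cmax = 4), a step with a small latent difference may be skipped unless four steps in a row were already skipped, which forces a full noise-prediction call and bounds the accumulated approximation error.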