AID: Attention Interpolation of Text-to-Image Diffusion
Authors: Qiyuan He, Jinghao Wang, Ziwei Liu, Angela Yao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that our method achieves greater consistency, smoothness, and efficiency in condition-based interpolation, aligning closely with human preferences. |
| Researcher Affiliation | Academia | Qiyuan He1 Jinghao Wang2 Ziwei Liu2 Angela Yao1, 1National University of Singapore 2S-Lab, Nanyang Technological University |
| Pseudocode | Yes | Algorithm 1 Exploration with Beta prior and Algorithm 2 Search smoothest sequence are presented in Appendix D. |
| Open Source Code | Yes | Our code and demo are available at https://qyh00.github.io/attention-interpolation-diffusion/. |
| Open Datasets | Yes | Our proposed framework is evaluated using data from CIFAR-10 [22] and the LAION-Aesthetics dataset from the larger LAION-5B collection [39]. |
| Dataset Splits | No | The paper describes sampling methods for trials and iterations but does not explicitly provide training, validation, or test dataset splits for model evaluation. |
| Hardware Specification | Yes | All quantitative and qualitative experiments presented in this work are conducted on a single H100 GPU with Float16 precision. |
| Software Dependencies | Yes | We use Stable Diffusion 1.4 [35] as the base model to implement our attention interpolation mechanism for quantitative evaluation. |
| Experiment Setup | Yes | In all experiments, a 512 × 512 image is generated with the DDIM Scheduler [42] and DPM Scheduler [26] within 25 timesteps. For Bayesian optimization of α and β in the beta prior used by the selection approach, the smoothness of the interpolation sequence is the objective, both hyperparameters range over [1, 15], 9 fixed exploration points are taken with α and β each chosen from {10, 12, 14}, and 15 optimization iterations are run. |
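The experiment-setup row can be made concrete with a small sketch of the beta-prior sampling and the 3 × 3 grid of fixed exploration points. This is an illustrative reconstruction only, not the authors' implementation (their full procedure is Algorithms 1–2 in Appendix D); the function name `beta_interpolation_coeffs` and the sorting step are assumptions.

```python
import itertools
import random

def beta_interpolation_coeffs(alpha: float, beta: float, m: int, seed: int = 0) -> list[float]:
    """Draw m interpolation coefficients in (0, 1) from a Beta(alpha, beta) prior,
    sorted so the sequence moves monotonically between the two conditions."""
    rng = random.Random(seed)
    return sorted(rng.betavariate(alpha, beta) for _ in range(m))

# The 9 fixed exploration points described in the setup:
# every (alpha, beta) pair with both values drawn from {10, 12, 14}.
FIXED_EXPLORATION = list(itertools.product([10, 12, 14], repeat=2))
```

After these 9 fixed evaluations, the reported setup runs 15 further Bayesian-optimization iterations over (α, β) ∈ [1, 15]², scoring each candidate by the smoothness of the resulting interpolation sequence.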