AP-Adapter: Improving Generalization of Automatic Prompts on Unseen Text-to-Image Diffusion Models

Authors: Yuchen Fu, Zhiwei Jiang, Yuliang Liu, Cong Wang, Zexuan Deng, Zhaoling Chen, Qing Gu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We curate a multi-modal, multi-model dataset that includes multiple diffusion models and their corresponding text-image data, and conduct experiments under a model generalization setting. The experimental results demonstrate the AP-Adapter's ability to enable the automatic prompts to generalize well to previously unseen diffusion models, generating high-quality images.
Researcher Affiliation | Academia | State Key Laboratory for Novel Software Technology, Nanjing University, China; yuchenfu@smail.nju.edu.cn, jzw@nju.edu.cn, {yuliangliu,cw,dengzx,zhaolingchen}@smail.nju.edu.cn, guq@nju.edu.cn
Pseudocode | Yes | Algorithm 1: Training pipeline
Open Source Code | No | Our contributions include the dataset we collected and the code for model training and testing. We will release the data and code after the paper is accepted.
Open Datasets | No | Data Collection. We sourced high-quality images and personalized SD checkpoints from the CIVITAI community. We collected 47,695 image-text pairs gathered from various checkpoints, ensuring privacy protection. Further analysis of our dataset is provided in Appendix B.1.
Dataset Splits | Yes | The source domain encompasses 7075 samples, whereas the target domain comprises 3064 samples.
Hardware Specification | Yes | In the Prototype-Based Prompt Adaptation stage, all models are trained on two NVIDIA RTX 3090 GPUs, with steps set to 10000, batch size set to 16, and image resolution set to 512.
Software Dependencies | Yes | As for the platform to implement our network, we use PyTorch 2.1.
Experiment Setup | Yes | During the training phase, we retrieve 5 pairs of natural language prompts and manually designed prompts as demonstrations for ICL from the dataset. [...] For the model's parameter settings, since the source domain data contains 40 checkpoints, the number of domain prototypes S is set to 40. The coefficients γ1, γ2, γ3, γ4 for the loss functions are 0.01, 1.0, 0.001 and 1.0, respectively. [...] with steps set to 10000, batch size set to 16, and image resolution set to 512.
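The quoted hyperparameters can be collected into one place for a reproduction attempt. The sketch below is a hypothetical configuration object: the field names are illustrative (not from the paper), while the values are taken directly from the quoted setup.

```python
# Hypothetical training configuration for the Prototype-Based Prompt
# Adaptation stage. Field names are illustrative; only the values are
# taken from the paper's quoted experiment setup.
from dataclasses import dataclass, field


@dataclass
class APAdapterConfig:
    num_icl_demonstrations: int = 5       # prompt pairs retrieved for ICL
    num_domain_prototypes: int = 40       # S; one per source-domain checkpoint
    loss_coefficients: tuple = (0.01, 1.0, 0.001, 1.0)  # gamma_1..gamma_4
    train_steps: int = 10_000
    batch_size: int = 16
    image_resolution: int = 512
    num_gpus: int = 2                     # two NVIDIA RTX 3090s


def weighted_loss(cfg: APAdapterConfig, loss_terms: list) -> float:
    """Combine the four loss terms with their quoted coefficients."""
    return sum(g * l for g, l in zip(cfg.loss_coefficients, loss_terms))


cfg = APAdapterConfig()
```

The `weighted_loss` helper only illustrates how the four coefficients would weight their corresponding loss terms; the actual loss definitions are in the paper.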