Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
AP-Adapter: Improving Generalization of Automatic Prompts on Unseen Text-to-Image Diffusion Models
Authors: Yuchen Fu, Zhiwei Jiang, Yuliang Liu, Cong Wang, Zexuan Deng, Zhaoling Chen, Qing Gu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We curate a multi-modal, multi-model dataset that includes multiple diffusion models and their corresponding text-image data, and conduct experiments under a model generalization setting. The experimental results demonstrate the AP-Adapter's ability to enable the automatic prompts to generalize well to previously unseen diffusion models, generating high-quality images. |
| Researcher Affiliation | Academia | State Key Laboratory for Novel Software Technology, Nanjing University, China EMAIL, EMAIL EMAIL EMAIL |
| Pseudocode | Yes | Algorithm 1 Training pipeline |
| Open Source Code | No | Our contributions include the dataset we collected and the code for model training and testing. We will release the data and code after the paper is accepted. |
| Open Datasets | No | Data Collection. We sourced high-quality images and personalized SD checkpoints from the CIVITAI community. We collected 47,695 image-text pairs gathered from various checkpoints, ensuring privacy protection. Further analysis of our dataset is provided in Appendix B.1. |
| Dataset Splits | Yes | The source domain encompasses 7075 samples, whereas the target domain comprises 3064 samples. |
| Hardware Specification | Yes | In the Prototype-Based Prompt Adaptation stage, all models are trained on two NVIDIA RTX 3090 GPUs, with steps set to 10000, batch size set to 16, and image resolution set to 512. |
| Software Dependencies | Yes | As for the platform to implement our network, we use PyTorch 2.1. |
| Experiment Setup | Yes | During the training phase, we retrieve 5 pairs of natural language prompts and manually designed prompts as demonstrations for ICL from the dataset. [...] For the model's parameter settings, since the source domain data contains 40 checkpoints, the number of domain prototypes S is set to 40. The coefficients γ1, γ2, γ3, γ4 for the loss functions are 0.01, 1.0, 0.001 and 1.0, respectively. [...] with steps set to 10000, batch size set to 16, and image resolution set to 512. |
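
The settings quoted in the table can be gathered into one configuration object for anyone attempting a reimplementation. The following is a minimal sketch, not the authors' code: the class name, field names, and the `weighted_total` helper are hypothetical, and combining the four loss terms as a plain weighted sum is an assumption, since the excerpts above only state the coefficient values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the table above; all field names are hypothetical."""
    num_icl_demonstrations: int = 5      # prompt pairs retrieved for ICL
    num_domain_prototypes: int = 40      # S, one per source-domain checkpoint
    loss_weights: tuple = (0.01, 1.0, 0.001, 1.0)  # gamma_1 .. gamma_4
    train_steps: int = 10_000
    batch_size: int = 16
    image_resolution: int = 512
    source_samples: int = 7075
    target_samples: int = 3064


def weighted_total(losses, weights):
    """Assumed combination of loss terms: sum_i gamma_i * L_i."""
    if len(losses) != len(weights):
        raise ValueError("need exactly one weight per loss term")
    return sum(w * l for w, l in zip(weights, losses))


setup = ReportedSetup()

# Placeholder loss values of 1.0 each, only to show the relative weighting.
total = weighted_total([1.0, 1.0, 1.0, 1.0], setup.loss_weights)

# Fraction of the reported split that belongs to the target domain.
target_share = setup.target_samples / (setup.source_samples + setup.target_samples)

print(f"weighted total with unit losses: {total:.3f}")
print(f"target-domain share of the split: {target_share:.1%}")
```

With unit placeholder losses, the second and fourth terms dominate the total, which shows how strongly the reported coefficients down-weight the first and third terms under the weighted-sum assumption.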