Stable Diffusion is Unstable

Authors: Chengbin Du, Yanxi Li, Zhongwei Qiu, Chang Xu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "our research has uncovered a lack of robustness in this generation process. ... ATM has achieved a 91.1% success rate in short-text attacks and an 81.2% success rate in long-text attacks. Further empirical analysis revealed three attack patterns ... drawing upon extensive experiments and empirical analyses employing ATM, we are able to disclose the existence of three distinct attack patterns ... In our experiments, we conduct comprehensive analyses of both long and short prompts. Furthermore, we conduct ablation studies specifically on long prompts ..."
Researcher Affiliation | Academia | "Chengbin Du, Yanxi Li, Zhongwei Qiu, Chang Xu, School of Computer Science, Faculty of Engineering, University of Sydney, Australia; chdu5632@uni.sydney.edu.au, yali0722@uni.sydney.edu.au, zhongwei.qiu@sydney.edu.au, c.xu@sydney.edu.au"
Pseudocode | Yes | "Algorithm 1: Auto-attack on Text-to-image Models (ATM)"
Open Source Code | Yes | "The code is available at https://github.com/duchengbin8/Stable_Diffusion_is_Unstable"
Open Datasets | Yes | "An extensive dataset comprising one thousand abbreviated textual descriptions, corresponding to all classes within ImageNet [3], was meticulously curated. ... For clean short prompts, we employ a standardized template: 'A photo of [CLASS_NAME]'. Clean long prompts, on the other hand, are generated using ChatGPT-4 [24] ..." (see the prompt-construction sketch after this table)
Dataset Splits | Yes | "FID and IS are computed by comparing the generated images to the ImageNet-1K validation set with torch-fidelity [22]." (see the metrics sketch after this table)
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper mentions several software components (CLIP, ChatGPT-4, torch-fidelity, DDIM, and DPM-Solver) but does not give version numbers for these dependencies.
Experiment Setup | Yes | "The number of search iterations T is set to 100. This value determines the number of iterations in the search stage ... The number of attack candidates N is set to 100. The learning rate η for the matrix ω is set to 0.3. The margin κ in the margin loss is set to 30." (see the hyperparameter sketch after this table)
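
Prompt-construction sketch. The clean short prompts follow the template "A photo of [CLASS_NAME]" over the 1,000 ImageNet classes. Below is a minimal sketch of that construction; the class-name file imagenet_classes.txt and its one-name-per-line format are assumptions for illustration, not something stated in the paper.

    # Build the 1,000 short clean prompts "A photo of [CLASS_NAME]".
    # imagenet_classes.txt (one class name per line) is a hypothetical
    # stand-in for however the ImageNet class list is actually loaded.
    with open("imagenet_classes.txt") as f:
        class_names = [line.strip() for line in f if line.strip()]

    short_prompts = [f"A photo of {name}" for name in class_names]
    assert len(short_prompts) == 1000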
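
Metrics sketch. FID and IS are reported against the ImageNet-1K validation set using torch-fidelity. A minimal sketch of that computation is shown below; the two directory paths are placeholders, and the exact torch-fidelity options used in the paper are not stated.

    # Compute FID and Inception Score with torch-fidelity
    # (pip install torch-fidelity).
    import torch_fidelity

    metrics = torch_fidelity.calculate_metrics(
        input1="generated_images/",  # placeholder: images generated from the prompts
        input2="imagenet_val/",      # placeholder: ImageNet-1K validation images
        fid=True,                    # Frechet Inception Distance
        isc=True,                    # Inception Score
        cuda=True,
    )
    print(metrics)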
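
Hyperparameter sketch. The reported ATM settings (T = 100 search iterations, N = 100 attack candidates, learning rate η = 0.3 for the matrix ω, margin κ = 30) can be collected in a small config, and the margin loss can be sketched as a standard hinge on classifier logits. This is only an illustrative sketch under assumed shapes and an assumed loss form; the actual parameterization of ω and the precise objective are given by the paper's Algorithm 1.

    import torch

    # Hyperparameters reported in the paper's experiment setup.
    T = 100        # number of search iterations
    N = 100        # number of attack candidates
    eta = 0.3      # learning rate for the matrix omega
    kappa = 30.0   # margin used in the margin loss

    def margin_loss(logits: torch.Tensor, target_class: int,
                    kappa: float = kappa) -> torch.Tensor:
        """Illustrative margin loss: push the target-class logit at least
        `kappa` below the highest non-target logit (hinge-style)."""
        target_logit = logits[:, target_class]
        others = logits.clone()
        others[:, target_class] = float("-inf")
        max_other = others.max(dim=1).values
        return torch.clamp(target_logit - max_other + kappa, min=0).mean()

    # Hypothetical setup for omega; its true shape and parameterization
    # follow ATM (Algorithm 1), so the values below are assumptions only.
    num_positions, vocab_size = 77, 49408  # assumed CLIP text length and vocabulary size
    omega = torch.zeros(num_positions, vocab_size, requires_grad=True)
    optimizer = torch.optim.Adam([omega], lr=eta)  # optimizer choice is an assumption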