ART: Automatic Red-teaming for Text-to-Image Models to Protect Benign Users
Authors: Guanlin Li, Kangjie Chen, Shudong Zhang, Jie Zhang, Tianwei Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With our comprehensive experiments, we reveal the toxicity of the popular open-source text-to-image models. The experiments also validate the effectiveness, adaptability, and great diversity of ART. |
| Researcher Affiliation | Academia | Guanlin Li1, Kangjie Chen1, Shudong Zhang2, Jie Zhang3, Tianwei Zhang1. 1Nanyang Technological University, 2Xidian University, 3CFAR and IHPC, A*STAR. |
| Pseudocode | No | The paper describes the methodology in text and provides figures, but no formal pseudocode or algorithm listing. |
| Open Source Code | Yes | Datasets and models can be found in https://github.com/GuanlinLee/ART. |
| Open Datasets | Yes | Additionally, we introduce three large-scale red-teaming datasets for studying the safety risks associated with text-to-image models. Datasets and models can be found in https://github.com/GuanlinLee/ART. |
| Dataset Splits | No | For LD, we adopt the Guide Model to generate 31,086 data items for the training set and 1,646 data for the test set. |
| Hardware Specification | Yes | We adopt 4 RTX A6000 (48GB) to fine-tune these models. We adopt 4 RTX A6000 during the inference phase. The Judge Models share one GPU. For the Writer Model, the Guide Model, and the T2I Model, each one occupies one GPU. |
| Software Dependencies | No | The paper mentions specific models and libraries (e.g., diffusers [45]) but does not provide a versioned list of software dependencies. |
| Experiment Setup | Yes | If there are no special instructions, we set the guidance scale as 7.5 and use the default settings for other hyperparameters based on diffusers [45]. All training details can be found in Appendix F. |
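
The Experiment Setup row pins only the guidance scale (7.5) and defers everything else to diffusers defaults. As a minimal sketch of what that configuration looks like with the diffusers library — the checkpoint id and prompt below are illustrative assumptions, not taken from the paper:

```python
# Minimal sketch of the quoted generation setup, assuming a Stable
# Diffusion-style text-to-image checkpoint served via diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumption: any SD-style T2I checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Guidance scale 7.5 as quoted in the paper; all other hyperparameters
# are left at the diffusers defaults.
image = pipe("a photo of a cat", guidance_scale=7.5).images[0]
image.save("sample.png")
```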