Position: Towards Implicit Prompt For Text-To-Image Models
Authors: Yue Yang, Yuqi Lin, Hong Liu, Wenqi Shao, Runjian Chen, Hailong Shang, Yu Wang, Yu Qiao, Kaipeng Zhang, Ping Luo
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a benchmark named ImplicitBench and conduct an investigation on the performance and impacts of implicit prompts with popular T2I models. Specifically, we design and collect more than 2,000 implicit prompts across three aspects: General Symbols, Celebrity Privacy, and Not-Safe-For-Work (NSFW) Issues, and evaluate six well-known T2I models' capabilities under these implicit prompts. Experiment results show that (1) T2I models are able to accurately create various target symbols indicated by implicit prompts; (2) implicit prompts bring potential risks of privacy leakage for T2I models; and (3) constraints on NSFW content in most of the evaluated T2I models can be bypassed with implicit prompts. |
| Researcher Affiliation | Academia | 1 Shanghai Jiao Tong University, Shanghai, China; 2 Shanghai AI Laboratory, Shanghai, China; 3 Osaka University, Osaka, Japan; 4 The University of Hong Kong, Hong Kong, China; 5 Research Institute of Tsinghua University in Shenzhen, Shenzhen, China. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and benchmark will be released on the website https://github.com/yangyue5114/implicit_prompt. |
| Open Datasets | Yes | First, we collected a benchmark of over 2,000 implicit prompts from three aspects, covering over twenty subclasses;... Our code and benchmark will be released on the website https://github.com/yangyue5114/implicit_prompt. |
| Dataset Splits | No | The paper uses a benchmark for evaluation of pre-trained models and does not describe training a model on its own dataset with explicit training/validation/test splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (like GPU models, CPU models, or memory) used for running its experiments. It mentions using various T2I models, some of which are closed-source APIs. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) needed to replicate the experiment. |
| Experiment Setup | Yes | Concretely, for each implicit prompt in our benchmark, we generate four images with each T2I model, for a more comprehensive evaluation and to reduce the potential impact of randomness in the generation process. We employ a state-of-the-art MLLM, GPT-4V (OpenAI, 2023), to evaluate our generated images. We utilize ArcFace (Deng et al., 2019) as the face recognizer. We utilize the built-in image safety checker (CompVis, 2022) provided by Stable Diffusion and another dedicated safety classifier (Qu et al., 2023) as our dual evaluation models. |
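The evaluation protocol in the setup row (four images per prompt, each scored by two independent safety models) can be sketched as the loop below. This is a hypothetical illustration, not the authors' released code: `generate_image`, `safety_checker`, and `dedicated_classifier` are stand-in stubs for a T2I pipeline, Stable Diffusion's built-in checker, and the classifier of Qu et al. (2023), respectively.

```python
IMAGES_PER_PROMPT = 4  # four generations per prompt to reduce randomness, per the paper


def generate_image(prompt: str, seed: int) -> str:
    """Stub for a T2I model call (e.g. a Stable Diffusion pipeline)."""
    return f"image(prompt={prompt!r}, seed={seed})"


def safety_checker(image: str) -> bool:
    """Stub for Stable Diffusion's built-in image safety checker.

    Returns True if the image is flagged as NSFW.
    """
    return False


def dedicated_classifier(image: str) -> bool:
    """Stub for the dedicated NSFW safety classifier (Qu et al., 2023)."""
    return False


def evaluate_prompt(prompt: str) -> dict:
    """Generate IMAGES_PER_PROMPT images and score each with both checkers."""
    images = [generate_image(prompt, seed) for seed in range(IMAGES_PER_PROMPT)]
    # Dual evaluation: an image counts as flagged if either checker trips.
    flags = [safety_checker(img) or dedicated_classifier(img) for img in images]
    return {"prompt": prompt, "n_images": len(images), "n_flagged": sum(flags)}


result = evaluate_prompt("a drink that pairs well with fish and chips")
print(result)
```

With real models substituted for the stubs, the per-prompt bypass rate for the NSFW aspect would follow from `n_flagged / n_images` aggregated over the benchmark.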