Position: Towards Implicit Prompt For Text-To-Image Models

Authors: Yue Yang, Yuqi Lin, Hong Liu, Wenqi Shao, Runjian Chen, Hailong Shang, Yu Wang, Yu Qiao, Kaipeng Zhang, Ping Luo

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We present a benchmark named Implicit Bench and conduct an investigation on the performance and impacts of implicit prompts with popular T2I models. Specifically, we design and collect more than 2,000 implicit prompts of three aspects: General Symbols, Celebrity Privacy, and Not-Safe-For-Work (NSFW) Issues, and evaluate six well-known T2I models capabilities under these implicit prompts. Experiment results show that (1) T2I models are able to accurately create various target symbols indicated by implicit prompts; (2) Implicit prompts bring potential risks of privacy leakage for T2I models. (3) Constraints of NSFW in most of the evaluated T2I models can be bypassed with implicit prompts.
Researcher Affiliation Academia 1Shanghai Jiao Tong University, Shanghai, China 2Shanghai AI Laboratory, Shanghai, China 3Osaka University, Osaka, Japan 4The University of Hong Kong, Hong Kong, China 5Research Institute of Tsinghua University in Shenzhen, Shenzhen, China.
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code Yes Our code and benchmark will be released in website https://github.com/yangyue5114/implicit prompt.
Open Datasets Yes First, we collected a benchmark of over 2000 implicit prompts from three aspects, covering over twenty subclasses;... Our code and benchmark will be released in website https://github.com/yangyue5114/implicit prompt.
Dataset Splits No The paper uses a benchmark for evaluation of pre-trained models and does not describe training a model on its own dataset with explicit training/validation/test splits.
Hardware Specification No The paper does not provide specific hardware details (like GPU models, CPU models, or memory) used for running its experiments. It mentions using various T2I models, some of which are closed-source APIs.
Software Dependencies No The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) needed to replicate the experiment.
Experiment Setup Yes Concretely, for each implicit prompt in our benchmark, we generate four images with each T2I model, to more comprehensive evaluation and reduce the potential impact of randomness in the generation process. We employ state-of-the-art MLLMs, GPT-4V (Open AI, 2023), to evaluate our generated images. We utilize Arcface (Deng et al., 2019) as the recognizer. We utilize the built-in image safety checker (Comp Vis, 2022) provided by Stable Diffusion and another dedicated safety classifier (Qu et al., 2023) as our dual evaluation models.