Easing Concept Bleeding in Diffusion via Entity Localization and Anchoring

Authors: Jiewei Zhang, Song Guo, Peiran Dong, Jie Zhang, Ziming Liu, Yue Yu, Xiao-Ming Wu

ICML 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments demonstrate its superior capability in precisely generating multiple objects as specified in the textual prompts. Experimental results illustrate that our approach excels in accurately generating multiple objects. In this section, we will conduct a thorough qualitative and quantitative comparison of our method with existing approaches. |
| Researcher Affiliation | Academia | ¹The Hong Kong Polytechnic University. ²Peng Cheng Laboratory. ³The Hong Kong University of Science and Technology. |
| Pseudocode | Yes | Algorithm 1 Entity Localization and Anchoring |
| Open Source Code | No | The paper does not provide an explicit statement about the release of source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper uses generated images based on specified prompt formats (e.g., 'a [entity A] and a [entity B]') for evaluation. While the paper refers to sets of entities (e.g., '20 animals and objects'), it does not mention or provide access information for a publicly available, formal training dataset. |
| Dataset Splits | No | The paper focuses on evaluating generated images from prompts rather than training on a specific dataset with explicit train/validation/test splits. Therefore, it does not specify dataset split information. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | Yes | Our algorithm is employed within the pre-trained stable diffusion V-1.4. |
| Experiment Setup | Yes | Specifically, we concentrate on cross-attention maps associated with entities mentioned in the prompt. These maps are primarily extracted in the upsampling block with a resolution of 16 × 16. ... We configure the start and end timesteps (Tstart, Tend) to establish meaningful constraints on entities. ... where λ serves as the weighting factor. |
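
The experiment setup names three ingredients: per-entity cross-attention maps at 16 × 16 resolution, a timestep window (Tstart, Tend), and a weighting factor λ. The sketch below shows one way these pieces could fit together. It is not the paper's implementation: the function names, the (256, num_tokens) attention layout, the inclusive window, and the step-index convention are all assumptions made for illustration, and the summary gives no concrete values for Tstart, Tend, or λ.

```python
from typing import List, Sequence

ATTN_RES = 16  # cross-attention map resolution stated in the setup (16 x 16)


def entity_maps(attn: Sequence[Sequence[float]],
                token_ids: Sequence[int]) -> List[List[List[float]]]:
    """Reshape flattened attention columns into 16 x 16 maps, one per entity token.

    Assumption: `attn` has shape (ATTN_RES * ATTN_RES, num_tokens), i.e. one row
    per spatial position and one column per prompt token.
    """
    if len(attn) != ATTN_RES * ATTN_RES:
        raise ValueError("expected ATTN_RES * ATTN_RES spatial positions")
    maps = []
    for tok in token_ids:
        column = [row[tok] for row in attn]  # attention of every position to this token
        maps.append([column[r * ATTN_RES:(r + 1) * ATTN_RES]
                     for r in range(ATTN_RES)])
    return maps


def constraint_weight(step: int, t_start: int, t_end: int, lam: float) -> float:
    """Return lam inside the inclusive [t_start, t_end] window and 0.0 outside it.

    Assumption: `step` is a denoising-step index and the window is inclusive;
    the paper may count timesteps differently.
    """
    return lam if t_start <= step <= t_end else 0.0
```

In a sampling loop, a constraint term computed from `entity_maps(...)` would be scaled by `constraint_weight(...)` at each step, so that it influences the latent only within the chosen window.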