DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models

Authors: Namhyuk Ahn, Junsoo Lee, Chunggi Lee, Kunhee Kim, Daesik Kim, Seung-Hun Nam, Kibeom Hong

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experimental results demonstrate its superior performance across multiple scenarios, suggesting its promising potential in artistic product creation." |
| Researcher Affiliation | Collaboration | Namhyuk Ahn (NAVER WEBTOON AI), Junsoo Lee (NAVER WEBTOON AI), Chunggi Lee (NAVER WEBTOON AI; Harvard University), Kunhee Kim (KAIST), Daesik Kim (NAVER WEBTOON AI), Seung-Hun Nam (NAVER WEBTOON AI), Kibeom Hong (Swatch On) |
| Pseudocode | No | The paper contains no section or figure labeled "Pseudocode" or "Algorithm", nor any structured, code-like blocks. |
| Open Source Code | No | Project page: https://nmhkahn.github.io/dreamstyler/. The paper links to a project page, which primarily showcases results, but does not state that source code is released or link to a code repository within the paper itself. |
| Open Datasets | No | "We collected a set of 32 images representing various artistic styles, following the literature on style transfer (Tan et al. 2019). To evaluate text-to-image synthesis, we prepared 40 text prompts, as described in Suppl." The authors collected and prepared their own data but provide no access information (e.g., a link, DOI, or explicit statement of public availability). |
| Dataset Splits | No | The paper describes its training process and objectives but does not explain how the collected data (32 style images and 40 text prompts) is divided into training, validation, and test sets, whether by percentages or sample counts. |
| Hardware Specification | No | The paper gives no details on the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions software components such as Stable Diffusion and BLIP-2 but specifies no version numbers for these or any other dependencies, which would be necessary for reproducibility. |
| Experiment Setup | Yes | "Implementation details. We use T = 6 for multi-stage TI and utilize human feedback-based context prompts by default." |
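The Experiment Setup row quotes T = 6 stages for multi-stage textual inversion (TI), i.e., one style embedding per chunk of the denoising schedule. Since the paper's code is not released, the following is only a minimal PyTorch sketch of what a per-stage embedding lookup could look like; the timestep range, embedding width, and all names here are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of multi-stage TI stage selection (T = 6).
# Constants below are assumed defaults for Stable Diffusion v1.x.
import torch

T = 6                       # number of TI stages, as stated in the paper
NUM_TRAIN_TIMESTEPS = 1000  # standard SD noise schedule length (assumed)
EMBED_DIM = 768             # CLIP text-embedding width for SD v1.x (assumed)

# One learnable style-token embedding per stage.
stage_embeddings = torch.nn.Parameter(torch.randn(T, EMBED_DIM) * 0.01)

def stage_for_timestep(t: int) -> int:
    """Map a diffusion timestep to one of T equal-width stages."""
    return min(t * T // NUM_TRAIN_TIMESTEPS, T - 1)

def style_embedding(t: int) -> torch.Tensor:
    """Return the style-token embedding active at timestep t."""
    return stage_embeddings[stage_for_timestep(t)]

# Timestep 999 (highest noise) falls in the last stage.
print(stage_for_timestep(999))      # -> 5
print(style_embedding(120).shape)   # -> torch.Size([768])
```

In this reading, each stage's embedding would be substituted for the style placeholder token in the text prompt during the timesteps of its stage, which is the general mechanism multi-stage TI describes.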