FineStyle: Fine-grained Controllable Style Personalization for Text-to-image Models

Authors: Gong Zhang, Kihyuk Sohn, Meera Hahn, Humphrey Shi, Irfan Essa

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show the effectiveness of FineStyle at following fine-grained text prompts and delivering visual quality faithful to the specified style, measured by CLIP scores and human raters. ... 5 Experiment
Researcher Affiliation | Collaboration | Gong Zhang (1,2), Kihyuk Sohn (3), Meera Hahn (2), Humphrey Shi (1), Irfan Essa (1,2). Affiliations: (1) Georgia Tech, (2) Google DeepMind, (3) Meta Reality Labs
Pseudocode | Yes | A.5 Derivation of Concept Attention Map ... import jax.numpy as jnp ... def aggregate_xattn_by_phrase(
Open Source Code | Yes | Visit https://github.com/SHI-Labs/FineStyle for code and more examples. ... Justification: We will release the codes to public.
Open Datasets | Yes | We adopt the evaluation set from [41] containing 24 styles encompassing fine-art oil painting, 3D rendering, and sculpture. ... We synthesize images by combining a filtered version of Parti [50] prompts and 10 styles from the evaluation set, details in Appendix A.2. ... We utilize CLIP [35] to calculate Text (text-image) and Style (image-image) scores.
Dataset Splits | No | The paper mentions training models and evaluating on a 'Parti prompts' set, but it does not provide specific train/validation/test splits or percentages.
Hardware Specification | Yes | We train FineStyle using Adam optimizer [23] on TPUv4 with a batch size of 8.
Software Dependencies | No | The paper mentions using the Adam optimizer [23] but does not name any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | A.3 Implementation Details ... We train FineStyle using Adam optimizer [23] on TPUv4 with a batch size of 8. See Tab. 3 for detailed hyperparameters. Table 3: Hyperparameters for optimizer, adapter architecture, and synthesis.
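
The Open Datasets entry above mentions CLIP-based Text (text-image) and Style (image-image) scores. Scores of this kind are commonly computed as the cosine similarity between CLIP embeddings, so the following is a minimal sketch under that assumption, not the paper's evaluation code. It assumes the embeddings have already been extracted by some CLIP model and are passed in as plain lists of floats. The function names (cosine_similarity, text_score, style_score, mean_score) are hypothetical, and the summary does not say how the paper averages scores over its evaluation set.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def text_score(image_emb, prompt_emb):
    """Text (text-image) score: generated image vs. its text prompt."""
    return cosine_similarity(image_emb, prompt_emb)


def style_score(image_emb, reference_emb):
    """Style (image-image) score: generated image vs. the style reference."""
    return cosine_similarity(image_emb, reference_emb)


def mean_score(score_fn, generated_embs, target_embs):
    """Average a per-pair score over matched lists of embeddings.

    Plain averaging is an assumption made for illustration.
    """
    if len(generated_embs) != len(target_embs) or not generated_embs:
        raise ValueError("need two non-empty lists of equal length")
    scores = [score_fn(g, t) for g, t in zip(generated_embs, target_embs)]
    return sum(scores) / len(scores)
```

Under this sketch, a higher Text score means the generated image sits closer to its prompt in CLIP space, and a higher Style score means it sits closer to the style reference image. The two can trade off against each other, which is why summaries like the one above report both.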