Text-to-Image Generation for Abstract Concepts
Authors: Jiayi Liao, Xu Chen, Qiang Fu, Lun Du, Xiangnan He, Xiang Wang, Shi Han, Dongmei Zhang
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluation results from human assessments and our newly designed metric, concept score, demonstrate the effectiveness of our framework in creating images that sufficiently express abstract concepts. Through experiments on the abstract branch of WordNet, we compare different approaches; the results indicate that prompts generated with our framework facilitate effective visualization of abstract concepts and demonstrate the effectiveness of TIAC in this task. (An illustrative concept-score sketch follows the table.) |
| Researcher Affiliation | Collaboration | 1 University of Science and Technology of China; 2 Microsoft; 3 MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China |
| Pseudocode | No | The paper describes the proposed framework and its stages in prose, but it does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not include any statement about releasing open-source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We construct two datasets based on abstract concepts in WordNet with different scales: the small-scale one contains 57 abstract concepts and the large-scale one contains 3,400 abstract concepts. WordNet is a lexical database where semantically similar words are grouped into a set of cognitive synonyms called a synset; each synset thus represents a unique concept with a corresponding definition in the database. We introduce the Simulacra Aesthetic Captions (SAC) dataset (Pressman, Crowson, and Contributors 2022) to enhance LLMs in building the connection between intent and form. URL: https://github.com/JD-P/simulacra-aesthetic-captions. (A WordNet traversal sketch follows the table.) |
| Dataset Splits | No | The paper describes the creation of small-scale and large-scale datasets for evaluation and human assessment, but it does not specify explicit training, validation, or test dataset splits for model training or evaluation in a reproducible manner. The models used (GPT-3.5, Stable Diffusion v2) are pre-trained. |
| Hardware Specification | No | The paper mentions the use of specific models like 'GPT-3.5 (text-davinci-003)' and 'Stable Diffusion v2 (v2-inference and the 512-base-ema checkpoint)', but it does not provide any details regarding the hardware (e.g., specific CPU/GPU models, memory) used to run these models or conduct the experiments. |
| Software Dependencies | No | The paper mentions the use of 'GPT-3.5 (text-davinci-003)' and 'Stable Diffusion v2', which are specific models, but it does not specify versions for underlying software dependencies such as programming languages, deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries. (A hedged diffusers loading sketch follows the table.) |
| Experiment Setup | No | The paper describes the stages of the TIAC framework and general settings like using 'few-shot examples', but it does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or specific optimizer settings for training or fine-tuning. |
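The paper's concept score metric is referenced above but its formula is not reproduced in this report. The following Python sketch is therefore purely illustrative: it assumes the score can be approximated by CLIP image-text similarity between a generated image and the concept's WordNet definition, which is our assumption rather than the paper's definition.

```python
# Hypothetical concept-score sketch: CLIP image-text cosine similarity between
# a generated image and a concept's definition. This is NOT the paper's exact
# metric, only a plausible stand-in for illustration.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def concept_score(image: Image.Image, concept_definition: str) -> float:
    """Cosine similarity between the image and the concept-definition text."""
    inputs = processor(text=[concept_definition], images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    image_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
    text_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
    return float((image_emb @ text_emb.T).item())
```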
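As a companion to the Open Datasets row, here is a hedged NLTK sketch of how abstract-concept synsets might be enumerated from WordNet. The paper's exact filtering for its 57-concept and 3,400-concept datasets is not specified, so the root synset and traversal below are assumptions.

```python
# Illustrative sketch of collecting abstract-concept synsets from WordNet.
# The root synset chosen here ('abstraction.n.06', WordNet's abstract-entity
# node) and the hyponym-closure traversal are assumptions, not the paper's
# documented procedure.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

root = wn.synset("abstraction.n.06")  # top of WordNet's abstract-entity branch
abstract_synsets = list(root.closure(lambda s: s.hyponyms()))

for synset in abstract_synsets[:5]:
    # Each synset is one concept; its gloss (definition) can serve as context.
    print(synset.name(), "-", synset.definition())
```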
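Since the paper pins neither hardware nor software versions, the following diffusers sketch shows one plausible way to load the named Stable Diffusion v2 backbone. The `stabilityai/stable-diffusion-2-base` model id is our assumption for the public release of the 512-base-ema weights, and the prompt is a made-up stand-in for an LLM-enriched prompt from the TIAC pipeline.

```python
# Minimal sketch of the image-generation backbone with Hugging Face diffusers,
# under the assumption that stabilityai/stable-diffusion-2-base corresponds to
# the 512-base-ema checkpoint named in the paper.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-base", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # hardware is unspecified in the paper; one GPU suffices

# A TIAC-enriched prompt would normally be substituted here (hypothetical example).
prompt = "an oil painting that evokes the abstract concept of freedom"
image = pipe(prompt, height=512, width=512).images[0]
image.save("freedom.png")
```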