Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis
Authors: Zhiyu Jin, Xuli Shen, Bin Li, Xiangyang Xue
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results validate the efficacy of the proposed scaling factor, enabling models to achieve better visual effects, image quality, and text alignment. Notably, these improvements are achieved without additional training or fine-tuning techniques. |
| Researcher Affiliation | Collaboration | Zhiyu Jin¹, Xuli Shen¹˒², Bin Li¹, Xiangyang Xue¹; ¹ Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University; ² Uni DT Technology |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. (A hedged sketch of what an entropy-aware attention rescaling could look like is given after this table.) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We evaluate their performance, without any training, on subsets of the LAION-400M and LAION-5B datasets ([43, 42]), which contain over 400 million and 5.85 billion CLIP-filtered image-text pairs, respectively. |
| Dataset Splits | No | The paper mentions using LAION-400M and LAION-5B datasets and randomly choosing text-image pairs, but does not specify exact training, validation, or test dataset splits or percentages for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of Stable Diffusion and Latent Diffusion models but does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | No | The paper describes the datasets and models used in the experimental setup but does not provide specific hyperparameter values or detailed training configurations (e.g., learning rate, batch size, epochs). |
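
Since the paper ships no pseudocode, the following is only a minimal sketch of what a training-free, entropy-aware attention rescaling of this kind could look like in PyTorch. The exact formula is not reproduced in this report; the form below, which multiplies the standard 1/sqrt(d) attention scale by an extra sqrt(log n / log n_train) factor, is an assumption chosen to keep softmax entropy roughly stable as the visual token count n changes with image size. The function name `entropy_scaled_attention` and the `n_train` parameter are illustrative, not taken from the paper.

```python
import math
import torch

def entropy_scaled_attention(q, k, v, n_train):
    """Scaled dot-product attention with a resolution-aware rescaling.

    q, k, v: (batch, heads, n, d) tensors, where n is the current number of
    visual tokens (it grows with the sampled image size in a diffusion U-Net).
    n_train: the token count seen at the model's training resolution.
    """
    n, d = q.shape[-2], q.shape[-1]
    # The usual 1/sqrt(d) scale keeps logits O(1); the extra
    # sqrt(log n / log n_train) factor (assumed form) counteracts the drift
    # in attention entropy when n differs from n_train at inference time.
    scale = math.sqrt(math.log(n) / math.log(n_train)) / math.sqrt(d)
    attn = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    return attn @ v

# Toy shapes for one mid-level U-Net attention block: 256 tokens (16x16) at
# the training resolution versus 576 tokens (24x24) at a larger test size.
q = torch.randn(1, 8, 576, 64)
k = torch.randn(1, 8, 576, 64)
v = torch.randn(1, 8, 576, 64)
out = entropy_scaled_attention(q, k, v, n_train=256)
print(out.shape)  # torch.Size([1, 8, 576, 64])
```

Because the change is confined to the attention scale, a modification of this shape drops into a pretrained model at inference time with no retraining or fine-tuning, which is consistent with the training-free claim quoted in the table above.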