Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis

Authors: Zhiyu Jin, Xuli Shen, Bin Li, Xiangyang Xue

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experimental results validate the efficacy of the proposed scaling factor, enabling models to achieve better visual effects, image quality, and text alignment. Notably, these improvements are achieved without additional training or fine-tuning. |
| Researcher Affiliation | Collaboration | Zhiyu Jin (1), Xuli Shen (1,2), Bin Li (1), Xiangyang Xue (1). (1) Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University; (2) Uni DT Technology. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code, nor a link to a code repository for the described methodology. |
| Open Datasets | Yes | "We evaluate their performance without any training upon a subset of LAION-400M and LAION-5B dataset ([43, 42]), which contain over 400 million and 5.85 billion CLIP-filtered image-text pairs, respectively." |
| Dataset Splits | No | The paper mentions using the LAION-400M and LAION-5B datasets and randomly choosing text-image pairs, but does not specify exact training, validation, or test splits or percentages for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions the use of Stable Diffusion and Latent Diffusion models but does not provide version numbers for software dependencies or libraries. |
| Experiment Setup | No | The paper describes the datasets and models used in the experimental setup but does not provide specific hyperparameter values or detailed training configurations (e.g., learning rate, batch size, epochs). |
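The Dataset Splits gap above (pairs are "randomly chosen" with no fixed subset specified) is the kind of detail a reproduction would have to pin down itself. A minimal sketch of one way to do that, using a seeded sample so every run selects the same evaluation pairs; the in-memory list and the `url`/`caption` field names are illustrative stand-ins, not the authors' protocol or the LAION metadata schema:

```python
import random

# Stand-in for a LAION-style metadata shard of image-text pairs.
# (Illustrative only; a real reproduction would load actual LAION metadata.)
metadata = [
    {"url": f"https://example.com/img_{i}.jpg", "caption": f"caption {i}"}
    for i in range(1000)
]

def sample_eval_pairs(pairs, n, seed=42):
    """Draw a seeded random subset so the same pairs are picked on every run."""
    rng = random.Random(seed)
    return rng.sample(pairs, n)

subset = sample_eval_pairs(metadata, 100)

# Re-sampling with the same seed reproduces the identical subset.
assert subset == sample_eval_pairs(metadata, 100)
print(len(subset))  # 100
```

Publishing the seed and subset size alongside results would close the reproducibility gap flagged in the table.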