ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models
Authors: Yingqing He, Shaoshu Yang, Haoxin Chen, Xiaodong Cun, Menghan Xia, Yong Zhang, Xintao Wang, Ran He, Qifeng Chen, Ying Shan
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 'Extensive experiments demonstrate that our approach can address the repetition issue well and achieve state-of-the-art performance on higher-resolution image synthesis, especially in texture details.' |
| Researcher Affiliation | Collaboration | Hong Kong University of Science and Technology; Chinese Academy of Sciences; Tencent AI Lab |
| Pseudocode | No | The paper describes the proposed methods (re-dilation, convolution dispersion, noise-damped classifier-free guidance) in detail in the text and with mathematical formulations, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. (Hedged code sketches of re-dilation and noise-damped guidance follow the table.) |
| Open Source Code | No | The abstract mentions a project website: 'More results are available at the project website: https://yingqinghe.github.io/scalecrafter/'. The website states 'Code is coming soon...', so the code is not currently available. |
| Open Datasets | Yes | 'We evaluate performance on the dataset of Laion-5B (Schuhmann et al., 2022) which contains 5 billion image-caption pairs.' |
| Dataset Splits | No | The method is tuning-free and is evaluated on dataset samples: 'When the inference resolution is 1024x1024, we sample 30k images with randomly sampled text prompts from the dataset. Due to massive computation, we sample 10k images when the inference resolution is higher than 1024x1024.' This describes the evaluation protocol, not a train/validation/test split, since the method requires no training or fine-tuning. |
| Hardware Specification | Yes | The paper reports timing on specific hardware: 'Time indicates the second used for synthesizing one image on one A100 GPU with 16-bit precision'. |
| Software Dependencies | No | The paper mentions '16-bit precision' in relation to hardware, and refers to 'diffusers' for naming conventions of layers, but it does not specify any software libraries, frameworks, or their version numbers that would be necessary for reproduction (e.g., PyTorch version, CUDA version). |
| Experiment Setup | Yes | 'We list the hyperparameters for SD 1.5 only for brevity.' The evaluation settings for SD 1.5 are shown in Tab. 6, 7, 8, 9, and the settings for SD XL 1.0 in Tab. 10, 11, 12, 13. These tables list specific values for latent resolution, re-dilated blocks, dilation scale, dispersed blocks, inference timesteps, etc.; see the illustrative settings sketch after this table. |
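
Since the paper ships no pseudocode and its code is not yet released, the following is a minimal sketch of the re-dilation idea as we read it from the paper's description: a pretrained convolution is re-run at inference time with an enlarged dilation (and matching padding), reusing the frozen weights so no tuning is needed. The class name `ReDilatedConv` and the wrapping approach are our own illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F
from torch import nn

class ReDilatedConv(nn.Module):
    """Runs a pretrained Conv2d with an enlarged dilation at inference time,
    reusing its frozen weights. Padding is scaled so that, for odd kernels
    and stride 1, the output spatial size matches the original convolution."""

    def __init__(self, conv: nn.Conv2d, dilation: int):
        super().__init__()
        self.conv = conv
        self.dilation = dilation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        k = self.conv.kernel_size[0]      # e.g. 3 for the UNet's 3x3 convs
        pad = self.dilation * (k // 2)    # keeps H x W unchanged for odd k
        return F.conv2d(
            x,
            self.conv.weight,
            self.conv.bias,
            stride=self.conv.stride,
            padding=pad,
            dilation=self.dilation,
            groups=self.conv.groups,
        )
```

Per the settings tables cited in the Experiment Setup row, re-dilation is applied only to selected UNet blocks and with a per-resolution dilation scale; the wrapper above is just the per-layer primitive one would swap into those blocks.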
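The noise-damped classifier-free guidance can be sketched similarly. As we understand the paper's description, the base denoising estimate comes from the model whose convolutions have been dispersed/re-dilated (strong denoising ability), while the original model contributes only the guidance direction; the exact combination in the paper may differ, and all names below are illustrative.

```python
import torch

@torch.no_grad()
def noise_damped_cfg(
    eps_damped: torch.Tensor,     # unconditional prediction from the dispersed/re-dilated model
    eps_cond: torch.Tensor,       # conditional prediction from the original model
    eps_uncond: torch.Tensor,     # unconditional prediction from the original model
    guidance_scale: float = 7.5,  # illustrative default, not from the paper
) -> torch.Tensor:
    # The damped model anchors the denoising estimate; the original model
    # contributes only the guidance *direction* (eps_cond - eps_uncond),
    # so its high-resolution noise artifacts are damped out.
    return eps_damped + guidance_scale * (eps_cond - eps_uncond)
```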
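For the Experiment Setup row: the paper's Tab. 6-13 tabulate per-resolution settings rather than prose. A hypothetical encoding of one such entry is sketched below; every value and block name is a placeholder (block names follow diffusers conventions, which the paper references for layer naming), and the real numbers must be taken from the paper's tables.

```python
# Placeholder encoding of one per-resolution settings entry (SD 1.5 at
# 1024x1024). All values are illustrative, NOT the paper's actual numbers.
SD15_SETTINGS_1024 = {
    "latent_resolution": (128, 128),                   # 1024 px / 8x VAE downsampling
    "redilated_blocks": ["mid_block", "up_blocks.0"],  # placeholder block names
    "dilation_scale": 2,                               # placeholder
    "dispersed_blocks": ["mid_block"],                 # placeholder
    "inference_timesteps": 50,                         # placeholder
}
```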