Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Diffusion Models as Cartoonists: The Curious Case of High Density Regions
Authors: Rafał Karczewski, Markus Heinonen, Vikas Garg
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical findings reveal the existence of significantly higher likelihood samples that typical samplers do not produce, often manifesting as cartoon-like drawings or blurry images depending on the noise level. Curiously, these patterns emerge in datasets devoid of such examples. We also present a novel approach to track sample likelihoods in diffusion SDEs, which remarkably incurs no additional computational cost. Code is available at https://github.com/Aalto-QuML/high-density-diffusion |
| Researcher Affiliation | Collaboration | Rafał Karczewski, Markus Heinonen (Aalto University); Vikas Garg (YaiYai Ltd and Aalto University) |
| Pseudocode | Yes | Algorithm 1 (High-density sampling): 1: Input: threshold t ∈ (0, T]; 2: Initialize x_T ~ N(0, σ_T² I_D); 3: Sample x_t ~ p_t(x_t) (eq. 6 or 7); 4: y_0 ← HD-ODE(t, 0, x_t) (eq. 18); 5: Return y_0 |
| Open Source Code | Yes | Code is available at https://github.com/Aalto-QuML/high-density-diffusion |
| Open Datasets | Yes | As a demonstration, we train two versions of a diffusion model on CIFAR-10 (Krizhevsky et al., 2009), one with maximum likelihood training (Kingma et al., 2021; Song et al., 2021b) and one optimized for sample quality (Kingma & Gao, 2024). ... We used FFHQ-256 and Churches-256 models from github.com/yang-song/score_sde_pytorch and ImageNet-64 from github.com/NVlabs/edm |
| Dataset Splits | No | The paper mentions using CIFAR-10 for training and FFHQ-256 for testing, but does not specify the explicit splits (e.g., percentages, sample counts, or citations to predefined splits) used for these datasets during training, validation, or testing. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for conducting the experiments. |
| Software Dependencies | No | The paper mentions that the UNet parametrization uses an implementation from 'docs.kidger.site/equinox/examples/unet/', implying the use of the Equinox library. However, no specific version number for Equinox or any other key software dependency (such as Python, PyTorch, or CUDA) is provided. |
| Experiment Setup | Yes | Specifically, these models are Variance Preserving (VP) SDEs with a linear log-SNR noise schedule and ε-parametrization... with hyperparameters: is_biggan=True, dim_mults=(1, 2, 2, 2), hidden_size=128, heads=8, dim_head=16, dropout_rate=0.1, num_res_blocks=4, attn_resolutions=[16]; trained for 2M steps, 128 batch size, and the adaptive noise schedule from Kingma & Gao (2024) with EMA weight 0.99. |
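The pseudocode row above (Algorithm 1) can be sketched as a runnable toy example. This is a minimal illustration under strong assumptions, not the authors' implementation: the "data" distribution is a Gaussian N(0, s²I), so the score of every marginal p_t is analytic and stands in for a trained network; a plain Euler probability-flow ODE stands in for the paper's HD-ODE (eq. 18); and all names (`alpha2`, `hd_ode`, `sample_pt`, ...) are hypothetical placeholders.

```python
import numpy as np

# Toy sketch of Algorithm 1 (high-density sampling) for a VP SDE with
# constant beta(t). Because p_0 = N(0, s^2 I) is Gaussian, the marginal
# p_t = N(0, v(t) I) has a closed-form score, so no network is needed.
# NOTE: the paper's HD-ODE (eq. 18) modifies the probability-flow ODE;
# we integrate the *standard* probability-flow ODE here as a stand-in.

BETA = 1.0  # constant VP noise rate beta(t)
S2 = 4.0    # variance of the toy data distribution p_0 = N(0, s^2 I)

def alpha2(t):
    """Squared signal coefficient alpha_t^2 of the VP forward process."""
    return np.exp(-BETA * t)

def v(t):
    """Marginal variance of p_t when p_0 = N(0, s^2 I)."""
    return 1.0 + alpha2(t) * (S2 - 1.0)

def score(x, t):
    """Analytic score of the Gaussian marginal p_t (replaces the network)."""
    return -x / v(t)

def sample_pt(t, dim, rng):
    """Step 3 of Algorithm 1: draw x_t directly from p_t."""
    return rng.normal(0.0, np.sqrt(v(t)), size=dim)

def hd_ode(t, x, n_steps=1000):
    """Steps 4-5: Euler-integrate the probability-flow ODE from t down to 0."""
    dt = t / n_steps
    now = t
    for _ in range(n_steps):
        drift = -0.5 * BETA * x - 0.5 * BETA * score(x, now)
        x = x - dt * drift  # step backward in time
        now -= dt
    return x

rng = np.random.default_rng(0)
t = 0.5  # threshold t in (0, T]
y0 = np.stack([hd_ode(t, sample_pt(t, 8, rng)) for _ in range(2000)])
print(y0.std())  # ~2.0: the flow maps p_t samples back to the data std
```

In this Gaussian toy case the map is exact in the limit of small Euler steps, so the empirical standard deviation of `y0` recovers the data standard deviation s = 2, which is a cheap sanity check on the integration direction and sign conventions.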