Image generation with shortest path diffusion

Authors: Ayan Das, Stathi Fotiadis, Anil Batra, Farhang Nabiei, Fengting Liao, Sattar Vakili, Da-Shan Shiu, Alberto Bernacchia

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our Shortest Path Diffusion (SPD) on CIFAR10. We show that any small departure from the shortest path results in worse performance, and SPD outperforms all methods based on image blurring. Our results suggest that SPD provides the optimal corruption. We also test SPD on ImageNet 64×64, on the task of unconditional generation, and we show that SPD improves on strong baselines without any hyperparameter tuning.
Researcher Affiliation | Collaboration | ¹MediaTek Research, Cambourne, UK; ²Department of Bioengineering, Imperial College London, London, UK; ³School of Informatics, University of Edinburgh, Edinburgh, UK.
Pseudocode | Yes | Algorithm 1: Shortest Path Diffusion (batch size = 1); Algorithm 2: Image generation (reverse process)
Open Source Code | Yes | Our code is available at https://github.com/mtkresearch/shortest-path-diffusion
Open Datasets | Yes | We use CIFAR10 (Krizhevsky, 2009) and ImageNet (Deng et al., 2009), two of the most frequently used benchmarks for evaluating generative models on images.
Dataset Splits | Yes | We use 50,000 samples for CIFAR10 and 10,000 samples for ImageNet 64×64, following Nichol & Dhariwal (2021a).
Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, memory amounts) were provided for running experiments. The paper does not mention the type of hardware used for training or inference.
Software Dependencies | No | The paper mentions using a 'slight modification of the codebase in Dhariwal & Nichol (2021)' and the 'Adam optimizer', but no specific software versions (e.g., Python, PyTorch, CUDA versions) are provided for reproducibility.
Experiment Setup | Yes | We used the Adam optimizer with a learning rate of 1×10⁻⁴. For CIFAR10, we use batch size 1024, 150,000 training iterations, and we record model checkpoints every 5,000 iterations. For ImageNet 64×64, we use batch size 336, 1M training iterations, and we record model checkpoints every 3,000 iterations.
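The reported hyperparameters can be collected into a small configuration sketch. This is illustrative only: the dictionary keys and the `checkpoints_recorded` helper are our own naming, not taken from the paper or its released codebase.

```python
# Hedged sketch of the training setup reported in the paper.
# Key names and the helper function are illustrative assumptions,
# not identifiers from the authors' codebase.

TRAIN_CONFIGS = {
    "cifar10": {
        "optimizer": "adam",
        "lr": 1e-4,                 # reported Adam learning rate
        "batch_size": 1024,         # reported batch size
        "train_iters": 150_000,     # reported training iterations
        "checkpoint_every": 5_000,  # checkpoint interval (iterations)
    },
    "imagenet64": {
        "optimizer": "adam",
        "lr": 1e-4,
        "batch_size": 336,
        "train_iters": 1_000_000,   # "1M training iterations"
        "checkpoint_every": 3_000,
    },
}

def checkpoints_recorded(cfg: dict) -> int:
    """Number of checkpoints recorded over a full training run."""
    return cfg["train_iters"] // cfg["checkpoint_every"]
```

Under these numbers, a CIFAR10 run records 30 checkpoints and an ImageNet 64×64 run records 333, which gives a rough sense of the evaluation granularity available for model selection.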