Denoising Diffusion Bridge Models
Authors: Linqi Zhou, Aaron Lou, Samar Khanna, Stefano Ermon
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we apply DDBMs to challenging image datasets in both pixel and latent space. On standard image translation problems, DDBMs achieve significant improvement over baseline methods, and, when we reduce the problem to image generation by setting the source distribution to random noise, DDBMs achieve comparable FID scores to state-of-the-art methods despite being built for a more general task. We evaluate on datasets with different image resolutions to demonstrate its applicability on a variety of scales. For evaluation metrics, we use Fréchet Inception Distance (FID) (Heusel et al., 2017) and Inception Scores (IS) (Barratt and Sharma, 2018) evaluated on all training samples to measure translation quality, and we use LPIPS (Zhang et al., 2018) and MSE (in [−1, 1] scale) to measure perceptual similarity and translation faithfulness. We now study the effect of our preconditioning and hybrid samplers on generation quality in the context of both VE and VP bridges (see Appendix B for the VP bridge parameterization). In the left column of Figure 4, we fix the guidance scale w at 1 and vary the Euler step size s from 0 to 0.9 to introduce stochasticity. We see a significant decrease in FID as we increase s, which produces the best performance at some value between 0 and 1 (e.g., s = 0.3 for Edges→Handbags). Table 3: Ablation study on the effect of sampler and preconditioning on FID. |
| Researcher Affiliation | Academia | Linqi Zhou Aaron Lou Samar Khanna Stefano Ermon Department of Computer Science, Stanford University {linqizhou, aaronlou, samar.khanna, ermon}@stanford.edu |
| Pseudocode | Yes | We introduce an additional scaling hyperparameter s, which defines a step ratio between t_{i−1} and t_i such that the interval [t_i − s(t_i − t_{i−1}), t_i] is used for Euler–Maruyama steps and [t_{i−1}, t_i − s(t_i − t_{i−1})] is used for Heun steps, as described in Algorithm 1. |
| Open Source Code | No | The paper does not provide an explicit statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | Yes | We choose Edges→Handbags (Isola et al., 2017) scaled to 64×64 pixels, which contains image pairs for translating from edge maps to colored handbags, and DIODE-Outdoor (Vasiljevic et al., 2019) scaled to 256×256, which contains normal maps and RGB images of real-world outdoor scenes. We evaluate our method on CIFAR-10 (Krizhevsky et al., 2009) and FFHQ-64×64 (Karras et al., 2019), which are processed according to Karras et al. (2022). |
| Dataset Splits | No | The paper mentions evaluating some metrics "on all training samples" and processing datasets "according to Karras et al. (2022)" for others, but it does not explicitly state specific train/validation/test splits, percentages, or absolute counts for any dataset. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | Unless noted otherwise, we use the same VE diffusion schedule as in EDM for our bridge model by default. In the left column of Figure 4, we fix the guidance scale w at 1 and vary the Euler step size s from 0 to 0.9 to introduce stochasticity. Diffusion and transport-based methods are evaluated with the same number of function evaluations (N = 40, which is the default for the EDM sampler for 64×64 images). |
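The interval split behind the hybrid sampler (quoted under "Pseudocode" above) can be sketched as follows. This is an illustrative reconstruction, not the paper's Algorithm 1: the helper name `hybrid_intervals` is hypothetical, and only the partitioning of each step by the ratio s is shown, not the Euler–Maruyama or Heun updates themselves.

```python
def hybrid_intervals(timesteps, s):
    """For each step [t_{i-1}, t_i], return the sub-intervals
    (Heun part, Euler-Maruyama part) induced by step ratio s.

    With s = 0 every step is a deterministic Heun step; as s
    approaches 1, most of each step becomes stochastic
    Euler-Maruyama integration (hypothetical helper, assumed
    interval convention from the quoted description).
    """
    intervals = []
    for t_prev, t in zip(timesteps[:-1], timesteps[1:]):
        # Boundary between the two sub-steps: t_i - s * (t_i - t_{i-1}).
        t_mid = t - s * (t - t_prev)
        heun_part = (t_prev, t_mid)            # [t_{i-1}, t_i - s(t_i - t_{i-1})]
        euler_maruyama_part = (t_mid, t)       # [t_i - s(t_i - t_{i-1}), t_i]
        intervals.append((heun_part, euler_maruyama_part))
    return intervals
```

For example, with `timesteps = [0.0, 0.5, 1.0]` and `s = 0.4`, the first step splits at 0.3 into a Heun sub-interval (0.0, 0.3) and an Euler–Maruyama sub-interval (0.3, 0.5), matching the ablation's observation that intermediate values of s trade determinism for stochasticity.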