Analyzing Diffusion as Serial Reproduction
Authors: Raja Marjieh, Ilia Sucholutsky, Thomas A. Langlois, Nori Jacoby, Thomas L. Griffiths
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then complement our theoretical analysis with simulations that exhibit these key features. Our work highlights how classic paradigms in cognitive science can shed light on state-of-the-art machine learning problems. ... Finally, we also performed diffusion experiments with deep neural networks to validate that this dependence on the number of steps indeed occurs. Specifically, we trained a Denoising Diffusion Probabilistic Model (Ho et al., 2020) to denoise MNIST (LeCun & Cortes, 2005), FMNIST (Xiao et al., 2017), KMNIST (Clanuwat et al., 2018), and CIFAR10 (Krizhevsky et al., 2009) images. |
| Researcher Affiliation | Academia | 1Department of Psychology, Princeton University, Princeton, USA 2Department of Computer Science, Princeton University, Princeton, USA 3Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany. |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Reproducibility. Code for reproducing all simulations of the optimal sampler is available at the following link: https://github.com/raja-marjieh/diffusion-sr ... We used Brian Pulfer's PyTorch re-implementation of DDPM (https://github.com/BrianPulfer/PapersReimplementations). |
| Open Datasets | Yes | Specifically, we trained a Denoising Diffusion Probabilistic Model (Ho et al., 2020) to denoise MNIST (LeCun & Cortes, 2005), FMNIST (Xiao et al., 2017), KMNIST (Clanuwat et al., 2018), and CIFAR10 (Krizhevsky et al., 2009) images. |
| Dataset Splits | No | To evaluate the sample quality from each trained model, we generated 6,000 images using the same number of steps as the model was trained on, and computed the Fréchet inception distance (FID; Heusel et al., 2017) between each set of generated images and the training set. (No specific training/validation/test split percentages or counts are provided; generated samples are compared only to the training set. A hedged FID sketch follows the table.) |
| Hardware Specification | Yes | For each dataset, this process took less than 2 hours to run on a single RTX 3080 Laptop GPU. |
| Software Dependencies | No | We used Brian Pulfer's PyTorch re-implementation of DDPM (https://github.com/BrianPulfer/PapersReimplementations). (While PyTorch is mentioned, a specific version number for PyTorch itself is not provided, nor for other key libraries.) |
| Experiment Setup | Yes | For this model, the noise at step t depends on the diffusion parameter β_t = β_min + (β_max − β_min) · t/T. To investigate the effect of the noise schedule, we set β_min = 0.0001, β_max = 0.02 and retrained the model multiple times with a different number of total steps each time (T ∈ [50, 500]). (A minimal sketch of this schedule follows the table.) |
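The linear schedule quoted in the Experiment Setup row is straightforward to write down. Below is a minimal NumPy sketch, assuming the t = 1, ..., T indexing convention of Ho et al. (2020); β_min, β_max, and the range of T come from the paper, while the function name and printed diagnostics are illustrative.

```python
import numpy as np

def linear_beta_schedule(T, beta_min=1e-4, beta_max=0.02):
    """Linear DDPM noise schedule: beta_t = beta_min + (beta_max - beta_min) * t / T."""
    t = np.arange(1, T + 1)  # assumed indexing t = 1..T (Ho et al., 2020)
    return beta_min + (beta_max - beta_min) * t / T

# Retraining with different total step counts, as in the paper (T in [50, 500]):
for T in (50, 500):
    betas = linear_beta_schedule(T)
    alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal retained after t steps
    print(f"T={T}: final beta={betas[-1]:.4f}, terminal alpha_bar={alpha_bar[-1]:.2e}")
```

The terminal value ᾱ_T = ∏_{t=1}^{T}(1 − β_t) makes the step-count dependence concrete: with this schedule, T = 50 leaves roughly 60% of the signal at the final step, while T = 500 drives it down to roughly 0.6%, i.e., much closer to pure noise.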
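For the evaluation protocol quoted under Dataset Splits, one common way to compute FID between generated samples and a training set is torchmetrics. This is a hedged sketch under stated assumptions, not the authors' evaluation code: the paper does not say which FID implementation was used, and the random uint8 tensors below stand in for real 3-channel image batches.

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# Stand-ins for the real data: 6,000 generated images vs. the training set,
# as uint8 tensors of shape (N, 3, H, W). Grayscale sets like MNIST would
# need their single channel repeated to 3 before this step.
train_imgs = torch.randint(0, 256, (6000, 3, 32, 32), dtype=torch.uint8)
gen_imgs = torch.randint(0, 256, (6000, 3, 32, 32), dtype=torch.uint8)

fid = FrechetInceptionDistance(feature=2048)  # standard Inception-v3 pool features
fid.update(train_imgs, real=True)
fid.update(gen_imgs, real=False)
print(f"FID: {fid.compute().item():.2f}")
```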