Going beyond Compositions, DDPMs Can Produce Zero-Shot Interpolations

Authors: Justin Deschenaux, Igor Krawczuk, Grigorios Chrysos, Volkan Cevher

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the interpolation abilities of DDPMs trained on examples with one factor of interest in Section 5. We demonstrate this ability on real-world datasets, filtered to retain examples with clear attributes only, as depicted in Figure 1 (right). Importantly, the training samples are highly separated in our experiments. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland; (2) LIONS, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland; (3) Department of Electrical and Computer Engineering, University of Wisconsin-Madison, USA. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available on GitHub. |
| Open Datasets | Yes | Real-world datasets like CelebA, despite their discrete labels, contain some diversity in attributes. For instance, the CelebA dataset contains clearly as well as mildly smiling faces, in the sense of Section 4.1. We train EfficientNet classifiers (Tan & Le, 2021), following Okawa et al. (2023). |
| Dataset Splits | No | The paper describes the process for filtering the CelebA dataset to create extreme examples for training the DDPMs, and mentions a 'validation set' for auxiliary classifiers (Appendix D.2) and a 'held-out' manually labeled set (Appendix D.1), but it does not provide explicit train/test/validation splits for the main DDPM training data itself. |
| Hardware Specification | Yes | Training for 250k steps ranged from 19 to 21 hours on an A100 40GiB or an RTX 4090, respectively. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and details training parameters, but it does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We train the diffusion model for 250k steps with learned denoising process variance, a learning rate of 1e-4, no weight decay, an EMA rate of 0.9999, 4000 diffusion steps, and the cosine noise schedule presented in Equation (10), well-suited for 64×64 images. |
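The 'Open Datasets' row describes filtering CelebA so that only examples with unambiguous attributes remain, with EfficientNet classifiers trained as auxiliary attribute predictors (Tan & Le, 2021; Okawa et al., 2023). The sketch below illustrates one plausible version of that filtering step; the confidence thresholds, the `smiling_classifier.pt` weights path, and the use of torchvision's `CelebA` and `efficientnet_b0` are illustrative assumptions, not the authors' exact pipeline.

```python
import torch
from torch.utils.data import DataLoader, Subset
import torchvision.transforms as T
from torchvision.datasets import CelebA
from torchvision.models import efficientnet_b0

device = "cuda" if torch.cuda.is_available() else "cpu"

# 64x64 inputs to match the resolution quoted in the table above.
transform = T.Compose([T.Resize(64), T.CenterCrop(64), T.ToTensor()])
dataset = CelebA(root="data", split="train", target_type="attr",
                 transform=transform, download=True)

# Auxiliary binary attribute classifier; assumes it was fine-tuned for
# "Smiling" beforehand (the weights path here is hypothetical).
clf = efficientnet_b0(num_classes=2)
clf.load_state_dict(torch.load("smiling_classifier.pt", map_location=device))
clf.to(device).eval()

BATCH = 256
keep = []  # indices of unambiguous ("extreme") examples
with torch.no_grad():
    for i, (x, _) in enumerate(DataLoader(dataset, batch_size=BATCH)):
        p = clf(x.to(device)).softmax(dim=-1)[:, 1]  # P(smiling)
        # Keep clearly smiling or clearly non-smiling faces and drop the
        # ambiguous middle, so the training samples stay highly separated.
        mask = (p > 0.95) | (p < 0.05)
        keep.extend((i * BATCH + torch.nonzero(mask).flatten()).tolist())

train_subset = Subset(dataset, keep)  # pool used to train the DDPM
```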
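The 'Experiment Setup' row quotes the core training hyperparameters. The sketch below wires them together, assuming the paper's Equation (10) matches the standard cosine noise schedule of Nichol & Dhariwal (2021); the denoiser is a trivial stand-in, since the authors' U-Net architecture is not reproduced here.

```python
import math
import torch

def cosine_alpha_bar(t, T, s=0.008):
    """Cumulative signal level alpha_bar(t) under the cosine schedule."""
    f = lambda u: math.cos((u / T + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)

T_steps = 4000  # diffusion steps, as quoted
alpha_bar = torch.tensor([cosine_alpha_bar(t, T_steps)
                          for t in range(T_steps + 1)])
# Per-step betas; the 0.999 clip near t = T follows Nichol & Dhariwal (2021).
betas = torch.clamp(1.0 - alpha_bar[1:] / alpha_bar[:-1], max=0.999)

# Stand-in for the authors' 64x64 U-Net. "Learned denoising process
# variance" means the network outputs a variance channel alongside the
# predicted noise, hence 6 output channels for 3-channel images.
model = torch.nn.Conv2d(3, 6, kernel_size=3, padding=1)

# Learning rate 1e-4, no weight decay, as quoted above.
opt = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=0.0)

# Exponential moving average of the weights at rate 0.9999, updated once
# per optimizer step across all 250k training steps.
ema_params = [p.detach().clone() for p in model.parameters()]

def ema_update(model, ema_params, rate=0.9999):
    with torch.no_grad():
        for p, e in zip(model.parameters(), ema_params):
            e.mul_(rate).add_(p, alpha=1.0 - rate)
```

Compared with a linear schedule, the cosine schedule keeps the signal level from decaying too quickly at early timesteps, which Nichol & Dhariwal found better suited to low-resolution images such as 64×64 faces.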