Diffusion Probabilistic Fields

Authors: Peiye Zhuang, Samira Abnar, Jiatao Gu, Alex Schwing, Joshua M. Susskind, Miguel Ángel Bautista

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically show that, while using the same denoising network, DPF effectively deals with different modalities like 2D images and 3D geometry, in addition to modeling distributions over fields defined on non-Euclidean metric spaces." (Abstract) "We present results on multiple domains: 2D image data, 3D geometry data, and spherical data. Across all domains we use the same score network architecture." (Section 4, Experimental Results)
Researcher Affiliation | Collaboration | Peiye Zhuang (Stanford University, peiye@stanford.edu); Samira Abnar, Jiatao Gu, Joshua M. Susskind, Miguel Ángel Bautista (Apple, {abnar, jgu32, jsusskind, mbautistamartin}@apple.com); Alexander G. Schwing (University of Illinois at Urbana-Champaign, aschwing@illinois.edu)
Pseudocode | Yes | Algorithm 1 (Training, page 4) and Algorithm 2 (Sampling, page 5)
Open Source Code | No | "We provide links to the public implementations that can be used to replicate our results in Sect. A and Sect. B, as well as describing all training parameters in Tab. 6. All of the datasets we report results on are public and can be freely downloaded." (Section 8, Reproducibility Statement) "Code available at https://github.com/rosinality/denoising-diffusion-pytorch." (Section A) "Finally, for Perceiver IO we use a modification of the public repository (https://huggingface.co/docs/transformers/model_doc/perceiver)." (Section B) The provided links point to a third-party DDPM implementation and a modified component library, not to DPF-specific source code.
Open Datasets | Yes | "We present empirical results on two standard image benchmarks: CelebA-HQ (Karras et al., 2018) 64² and CIFAR-10 (Krizhevsky, 2009) 32²." (Section 4.1) "We now turn to the task of modeling distributions over 3D objects and present results on the ShapeNet dataset (Chang et al., 2015)." (Section 4.2) "To uniformly sample points in S² we use the Driscoll-Healy algorithm (Driscoll & Healy, 1994) and sample points at a resolution of 32² and 64² for spherical MNIST (LeCun et al., 1998) and AFHQ (Choi et al., 2020) data, respectively." (Section 4.3) "All of the datasets we report results on are public and can be freely downloaded." (Section 8)
Dataset Splits | No | The paper makes no explicit mention of training, validation, or test dataset splits (e.g., percentages, counts, or specific predefined splits).
Hardware Specification | Yes | "We use 8 A100 GPUs for all experiments."
Software Dependencies | No | "We use an Adam (Kingma & Ba, 2015) optimizer during training. We set the learning rate to 1e-4. We set the batch size to 16 for all image datasets. We use 8 A100 GPUs for all experiments." "Finally, for Perceiver IO we use a modification of the public repository (https://huggingface.co/docs/transformers/model_doc/perceiver)." (Section B) The paper names software components and an optimizer but does not specify version numbers (e.g., the PyTorch version).
Experiment Setup | Yes | "We set the learning rate to 1e-4. We set the batch size to 16 for all image datasets." (Section B) "Perceiver IO settings for all experiments with quantitative evaluation are show[n] in Tab. 6." (Section B; from context in the paper, this should read Tab. 7.) "Table 7: Hyperparameters and settings for DPF on different datasets." (Table 7 caption)
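For orientation, the "Algorithm 1 Training" entry above refers to a standard denoising-diffusion training step. The toy sketch below is not the authors' code: it illustrates the generic DDPM-style objective (noise a clean value at a random timestep, predict the noise, take the squared error) with scalars and a stand-in score network; the schedule constants, function names, and the trivial network are all hypothetical illustration choices, and the paper's actual Algorithm 1 operates on field coordinate-value pairs with a Perceiver IO-style score network.

```python
import math
import random

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule with cumulative alpha-bar products,
    as in standard DDPM (constants here are illustrative defaults)."""
    betas = [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= (1.0 - b)          # alpha_bar_t = prod_{s<=t} (1 - beta_s)
        alpha_bars.append(prod)
    return alpha_bars

def training_step(y0, score_net, alpha_bars, rng=random):
    """One denoising training step on a scalar y0: sample a timestep,
    apply forward noising, and return the squared error between the
    true noise and the network's prediction."""
    t = rng.randrange(len(alpha_bars))
    eps = rng.gauss(0.0, 1.0)
    ab = alpha_bars[t]
    y_t = math.sqrt(ab) * y0 + math.sqrt(1.0 - ab) * eps  # forward noising
    eps_hat = score_net(y_t, t)    # stand-in for the paper's score network
    return (eps - eps_hat) ** 2

# Usage: a trivial stand-in network that always predicts zero noise.
loss = training_step(0.5, lambda y, t: 0.0, make_schedule())
```

In the reported setup this loss would be minimized with Adam at learning rate 1e-4 and batch size 16, per the Experiment Setup row above.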