Retrieval-Augmented Diffusion Models

Authors: Andreas Blattmann, Robin Rombach, Kaan Oktay, Jonas Müller, Björn Ommer

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | As demonstrated by our experiments, simply swapping the database for one with different contents transfers a trained model post-hoc to a novel domain. The evaluation shows competitive performance on tasks which the generative model has not been trained on, such as class-conditional synthesis, zero-shot stylization or text-to-image synthesis without requiring paired text-image data. (The database-swap mechanism is illustrated in the retrieval sketch after the table.)
Researcher Affiliation | Academia | Andreas Blattmann, Robin Rombach, Kaan Oktay, Jonas Müller, Björn Ommer; LMU Munich, MCML & IWR, Heidelberg University, Germany
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/CompVis/retrieval-augmented-diffusion-models
Open Datasets | Yes | We train RDMs on the dogs-subset of ImageNet [13] with i) WikiArt [66] (RDM-WA), ii) MS-COCO [7] (RDM-COCO) and iii) 20M examples obtained by cropping images (see App. F.1) from OpenImages [46] as train database D_train...
Dataset Splits | Yes | We evaluate their performance on the ImageNet train and validation sets in Tab. 1, which shows RDM-OI to closely reach the performance of RDM-IN in CLIP-FID [48] and achieve more diverse results.
Hardware Specification | No | The paper states that the total amount of compute and type of resources used are detailed in the supplemental material, but no specific hardware models (e.g., GPU/CPU types) are mentioned in the main text.
Software Dependencies | No | The paper mentions software such as ScaNN and CLIP ViT-B/32 but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | For hyperparameters, implementation and evaluation details cf. Sec. F. For ImageNet, samples are generated with m = 0.01, guidance scale s = 2.0 and 100 DDIM steps for RDM, and with m = 0.05, guidance scale s = 3.0 and top-k = 2048 for RARM. On FFHQ we use s = 1.0, m = 0.1. (A hedged sampling sketch based on these settings follows the retrieval sketch below.)
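
To make the database-swap claim in the "Research Type" row concrete, the following is a minimal sketch of retrieval-augmented conditioning: encode a query with a CLIP-style image encoder, fetch its k nearest neighbors from an external embedding database by cosine similarity, and condition the generator on those neighbor embeddings. Swapping the database embeddings for those of a different dataset (e.g. WikiArt instead of OpenImages) changes the output domain without retraining. All names here (retrieve_neighbors, database_embeddings, diffusion_model) are hypothetical; the paper's actual pipeline uses CLIP and ScaNN, whose APIs are not reproduced here, and brute-force similarity search stands in for the approximate nearest-neighbor index.

```python
import torch
import torch.nn.functional as F

def retrieve_neighbors(query_emb: torch.Tensor,
                       database_embeddings: torch.Tensor,
                       k: int = 4) -> torch.Tensor:
    """Return the k nearest database embeddings by cosine similarity.

    query_emb:           (d,) query embedding
    database_embeddings: (N, d) embeddings of the external retrieval database
    """
    q = F.normalize(query_emb, dim=-1)
    db = F.normalize(database_embeddings, dim=-1)
    sims = db @ q                      # (N,) cosine similarities to the query
    idx = sims.topk(k).indices         # indices of the k nearest neighbors
    return database_embeddings[idx]    # (k, d) neighbor embeddings

# --- usage sketch (random placeholders instead of real CLIP embeddings) ---
d, N = 512, 20_000
database_embeddings = torch.randn(N, d)   # stand-in for a CLIP-embedded database
query_emb = torch.randn(d)                # stand-in for the query's CLIP embedding

neighbors = retrieve_neighbors(query_emb, database_embeddings, k=4)
# A retrieval-augmented generator would now be conditioned on `neighbors`,
# e.g. eps = diffusion_model(x_t, t, cond=neighbors)   # hypothetical call
```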
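
The "Experiment Setup" row reports 100 DDIM steps and a guidance scale s = 2.0 for ImageNet sampling with RDM. As a rough illustration of how those two numbers enter a sampler, here is a minimal deterministic DDIM loop (eta = 0) with classifier-free guidance. The model(x, t, cond) interface, the linear beta schedule, and the tensor shapes are assumptions for illustration only and do not reproduce the released code; the retrieval-specific parameter m is not modeled here.

```python
import torch

@torch.no_grad()
def ddim_sample_cfg(model, cond, uncond, *, shape=(1, 3, 64, 64),
                    num_steps=100, guidance_scale=2.0, num_train_steps=1000):
    """Deterministic DDIM sampling (eta = 0) with classifier-free guidance.

    model(x, t, cond) is assumed to predict the noise eps; `cond` and `uncond`
    are the conditional and unconditional (null) conditionings.
    """
    # Assumed linear beta schedule; the actual training schedule may differ.
    betas = torch.linspace(1e-4, 2e-2, num_train_steps)
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)

    timesteps = torch.linspace(num_train_steps - 1, 0, num_steps).long()
    x = torch.randn(shape)

    for i, t in enumerate(timesteps):
        # Classifier-free guidance: eps = eps_u + s * (eps_c - eps_u)
        eps_c = model(x, t, cond)
        eps_u = model(x, t, uncond)
        eps = eps_u + guidance_scale * (eps_c - eps_u)

        a_t = alpha_bar[t]
        a_prev = alpha_bar[timesteps[i + 1]] if i + 1 < num_steps else torch.tensor(1.0)

        # Predict x0 from the current noise estimate, then take the
        # deterministic DDIM step toward the previous timestep.
        x0 = (x - torch.sqrt(1.0 - a_t) * eps) / torch.sqrt(a_t)
        x = torch.sqrt(a_prev) * x0 + torch.sqrt(1.0 - a_prev) * eps

    return x
```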