Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Retrieval-Augmented Diffusion Models
Authors: Andreas Blattmann, Robin Rombach, Kaan Oktay, Jonas Müller, Björn Ommer
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As demonstrated by our experiments, simply swapping the database for one with different contents transfers a trained model post-hoc to a novel domain. The evaluation shows competitive performance on tasks which the generative model has not been trained on, such as class-conditional synthesis, zero-shot stylization or text-to-image synthesis without requiring paired text-image data. |
| Researcher Affiliation | Academia | Andreas Blattmann, Robin Rombach, Kaan Oktay, Jonas Müller, Björn Ommer; LMU Munich, MCML & IWR, Heidelberg University, Germany |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/CompVis/retrieval-augmented-diffusion-models |
| Open Datasets | Yes | We train RDMs on the dogs-subset of ImageNet [13] with i) WikiArt [66] (RDM-WA), ii) MS-COCO [7] (RDM-COCO) and iii) 20M examples obtained by cropping images (see App. F.1) from Open Images [46] as train database D_train... |
| Dataset Splits | Yes | We evaluate their performance on the ImageNet train- and validation-sets in Tab. 1, which shows RDM-OI to closely reach the performance of RDM-IN in CLIP-FID [48] and achieve more diverse results. |
| Hardware Specification | No | The paper states that the total amount of compute and type of resources used are detailed in the supplemental material, but no specific hardware models (e.g., GPU/CPU types) are mentioned in the main text. |
| Software Dependencies | No | The paper mentions software like 'ScaNN' and 'CLIP-ViT-B/32' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For hyperparameters, implementation and evaluation details cf. Sec. F. For ImageNet samples are generated with m = 0.01, guidance with s = 2.0 and 100 DDIM steps for RDM and m = 0.05, guidance scale s = 3.0 and top-k = 2048 for RARM. On FFHQ we use s = 1.0, m = 0.1. |
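
The sampling settings quoted in the Experiment Setup row can be collected into a small lookup structure, which may be convenient when trying to reproduce the reported numbers. The sketch below is illustrative only: the class name `SamplingSettings`, the field names, and the `lookup` helper are hypothetical and are not taken from the paper or its repository. The values are copied from the quoted text, and fields the quote does not specify are left as `None`. The meaning of `m` and `s` is defined in the paper itself (cf. Sec. F).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SamplingSettings:
    """Hypothetical container for the settings quoted in the table row."""
    dataset: str
    model: Optional[str]            # None where the quote does not name the model
    m: float                        # parameter "m" as named in the quoted text
    guidance_scale: float           # guidance scale "s" as named in the quoted text
    ddim_steps: Optional[int] = None  # only reported for RDM on ImageNet
    top_k: Optional[int] = None       # only reported for RARM on ImageNet


# Values copied from the Experiment Setup row; nothing here is inferred.
REPORTED_SETTINGS: List[SamplingSettings] = [
    SamplingSettings("ImageNet", "RDM", m=0.01, guidance_scale=2.0, ddim_steps=100),
    SamplingSettings("ImageNet", "RARM", m=0.05, guidance_scale=3.0, top_k=2048),
    SamplingSettings("FFHQ", None, m=0.1, guidance_scale=1.0),
]


def lookup(dataset: str, model: Optional[str] = None) -> List[SamplingSettings]:
    """Return all reported settings for a dataset, optionally filtered by model."""
    return [
        s for s in REPORTED_SETTINGS
        if s.dataset == dataset and (model is None or s.model == model)
    ]
```

For instance, `lookup("ImageNet", "RDM")` returns the single entry with 100 DDIM steps, while `lookup("ImageNet")` returns both the RDM and RARM entries.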