Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
Authors: Yilun Du, Conor Durkan, Robin Strudel, Joshua B. Tenenbaum, Sander Dieleman, Rob Fergus, Jascha Sohl-Dickstein, Arnaud Doucet, Will Sussman Grathwohl
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We demonstrate the effectiveness of our approach in settings from 2D data to high-resolution text-to-image generation." and "5 Experiments". |
| Researcher Affiliation | Collaboration | ¹MIT, ²Google DeepMind. |
| Pseudocode | Yes | Algorithm 1 Annealed MCMC |
| Open Source Code | No | Project webpage: https://energy-based-model.github.io/reduce-reuse-recycle/. This is a project overview page, not an explicit code repository or statement of code release for the methodology. |
| Open Datasets | Yes | "We train our models on a dataset of images containing between 1 and 5 examples of various shapes taken from CLEVR (Johnson et al., 2017)." and "Next, we train unconditional diffusion models and a noise-conditioned classifier on ImageNet." |
| Dataset Splits | No | The paper mentions using the CLEVR and ImageNet datasets but does not specify exact train/validation/test split percentages, sample counts, or detailed splitting methodology. |
| Hardware Specification | Yes | 10 minutes on 8 TPUv2 cores, 8 hours on 8 TPUv2 cores, 3 days on 16 TPUv2 cores, one week on an internal text/image dataset consisting of 400 million images using 32 TPUv3 cores. |
| Software Dependencies | No | The paper mentions software like the 'Adam optimizer' but does not provide specific version numbers for any key software components or libraries. |
| Experiment Setup | Yes | For synthetic datasets, we train both score and energy based diffusion models using a small residual MLP model with 4 residual blocks, with an internal hidden dimension of 128. We train models for 15000 iterations (10 minutes on 8 TPUv2 cores) using the Adam optimizer with learning rate of 1e-3, and train diffusion models on 100 discrete timesteps with linear schedule of β values. |
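
The synthetic-data settings quoted in the Experiment Setup row can be collected into a small configuration, together with a linear β schedule over 100 discrete timesteps. The following is a minimal sketch in plain Python, not the authors' code. The names `SyntheticConfig`, `linear_beta_schedule`, and `cumulative_alphas` are hypothetical, and the schedule endpoints `beta_start` and `beta_end` are assumptions, since the quoted text does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticConfig:
    """Hypothetical container for the synthetic-data settings."""

    # Values quoted in the Experiment Setup row.
    num_residual_blocks: int = 4
    hidden_dim: int = 128
    train_iterations: int = 15_000
    learning_rate: float = 1e-3  # used with the Adam optimizer
    num_timesteps: int = 100

    # ASSUMPTION: the endpoints of the linear schedule are not given in
    # the quoted text; these are placeholder values for illustration.
    beta_start: float = 1e-4
    beta_end: float = 2e-2


def linear_beta_schedule(num_timesteps, beta_start, beta_end):
    """Return num_timesteps evenly spaced beta values, endpoints included."""
    if num_timesteps < 2:
        raise ValueError("num_timesteps must be at least 2")
    step = (beta_end - beta_start) / (num_timesteps - 1)
    return [beta_start + i * step for i in range(num_timesteps)]


def cumulative_alphas(betas):
    """Return the running product of (1 - beta_t) for each timestep."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


config = SyntheticConfig()
betas = linear_beta_schedule(
    config.num_timesteps, config.beta_start, config.beta_end
)
alpha_bars = cumulative_alphas(betas)
```

The running product in `alpha_bars` is the fraction of signal variance left at each step of a standard Gaussian forward process. How quickly it decays depends entirely on the assumed endpoints, so these values should be replaced with the paper's if they are known.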