Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
DiffusionSat: A Generative Foundation Model for Satellite Imagery
Authors: Samar Khanna, Patrick Liu, Linqi Zhou, Chenlin Meng, Robin Rombach, Marshall Burke, David B. Lobell, Stefano Ermon
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We describe the experiments for the tasks in section 3. Implementation details are in appendix A.1. |
| Researcher Affiliation | Collaboration | 1Stanford University, 2Stability AI, 3CZ Biohub |
| Pseudocode | No | The paper does not include pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | The project website can be found here: https://samar-khanna.github.io/DiffusionSat/ |
| Open Datasets | Yes | Instead, we compile publicly available annotated satellite data and contribute a large, high-resolution generative dataset for satellite images. Detailed descriptions on how the caption is generated for each dataset are in the appendix. (i) fMoW: Function Map of the World (fMoW) Christie et al. (2018)... (ii) Satlas: Satlas Bastani et al. (2022)... (iii) SpaceNet: Spacenet Van Etten et al. (2018; 2021)... |
| Dataset Splits | Yes | We generate 10000 samples on the validation sets of fMoW-RGB. |
| Hardware Specification | Yes | We use 8 NVIDIA A100 GPUs. |
| Software Dependencies | No | All models are trained on half-precision and with gradient checkpointing, borrowing from the Diffusers (von Platen et al., 2022) library. |
| Experiment Setup | Yes | The text-to-image models are trained with a batch size of 128 for 100000 iterations, which we determined was sufficient for convergence. We choose a constant learning rate of 2e-6 with the AdamW optimizer. |
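
The quoted hardware, precision, and experiment setup can be collected into a small configuration record, which makes the implied training budget easy to compute. The following is a minimal sketch, not the authors' code: the class name `TrainConfig` and all field names are hypothetical, and it assumes that the batch size of 128 is a global batch split evenly across the 8 GPUs, which the quoted text does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the table above; names are hypothetical."""

    batch_size: int = 128            # "batch size of 128"
    iterations: int = 100_000        # "100000 iterations"
    learning_rate: float = 2e-6      # "constant learning rate of 2e-6"
    lr_schedule: str = "constant"
    optimizer: str = "AdamW"
    num_gpus: int = 8                # "8 NVIDIA A100 GPUs"
    half_precision: bool = True      # "trained on half-precision"
    gradient_checkpointing: bool = True

    def samples_seen(self) -> int:
        """Total training samples processed over the full run."""
        return self.batch_size * self.iterations

    def per_gpu_batch(self) -> int:
        """Per-device batch size.

        ASSUMPTION: 128 is the global batch, divided evenly across GPUs.
        The source does not say whether 128 is global or per device.
        """
        return self.batch_size // self.num_gpus


config = TrainConfig()
print(config.samples_seen())   # total samples under the quoted settings
print(config.per_gpu_batch())  # only meaningful under the assumption above
```

Under these settings the run processes 128 × 100000 = 12.8 million samples; the per-device figure depends entirely on the global-batch assumption.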