DiffusionSat: A Generative Foundation Model for Satellite Imagery

Authors: Samar Khanna, Patrick Liu, Linqi Zhou, Chenlin Meng, Robin Rombach, Marshall Burke, David B. Lobell, Stefano Ermon

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We describe the experiments for the tasks in Section 3. Implementation details are in Appendix A.1. |
| Researcher Affiliation | Collaboration | Stanford University, Stability AI, CZ Biohub |
| Pseudocode | No | The paper does not include pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | The project website can be found here: https://samar-khanna.github.io/DiffusionSat/ |
| Open Datasets | Yes | Instead, we compile publicly available annotated satellite data and contribute a large, high-resolution generative dataset for satellite images. Detailed descriptions of how the caption is generated for each dataset are in the appendix. (i) fMoW: Functional Map of the World (fMoW) Christie et al. (2018)... (ii) Satlas: Satlas Bastani et al. (2022)... (iii) SpaceNet: SpaceNet Van Etten et al. (2018; 2021)... |
| Dataset Splits | Yes | We generate 10000 samples on the validation set of fMoW-RGB. |
| Hardware Specification | Yes | We use 8 NVIDIA A100 GPUs. |
| Software Dependencies | No | All models are trained in half precision and with gradient checkpointing, borrowing from the Diffusers (von Platen et al., 2022) library. |
| Experiment Setup | Yes | The text-to-image models are trained with a batch size of 128 for 100000 iterations, which we determined was sufficient for convergence. We choose a constant learning rate of 2e-6 with the AdamW optimizer. |
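
Taken together, the Software Dependencies and Experiment Setup rows describe a Diffusers-based training recipe: half precision with gradient checkpointing, AdamW at a constant learning rate of 2e-6, a batch size of 128, and 100,000 iterations. The sketch below is a minimal illustration of that configuration, not the authors' training script; the base model ID, `data_iterator`, and `diffusion_loss` are placeholders assumed for the example.

```python
import torch
from diffusers import UNet2DConditionModel

# Minimal sketch of the reported training configuration. Only the
# hyperparameters (batch size 128, 100,000 iterations, constant lr 2e-6,
# AdamW, half precision, gradient checkpointing) come from the paper;
# everything else is an assumption for illustration.
unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-2-1", subfolder="unet"  # assumed base model
).to("cuda")
unet.enable_gradient_checkpointing()  # recompute activations to save memory
unet.train()

optimizer = torch.optim.AdamW(unet.parameters(), lr=2e-6)  # constant LR, no schedule
scaler = torch.cuda.amp.GradScaler()  # loss scaling for fp16 training

for step in range(100_000):
    batch = next(data_iterator)  # hypothetical iterator yielding batches of 128
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = diffusion_loss(unet, batch)  # hypothetical denoising-loss helper
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```

With the 8 A100 GPUs reported above, a global batch of 128 would correspond to 16 samples per GPU, though the paper does not state how the batch is split.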