DiffusionSat: A Generative Foundation Model for Satellite Imagery
Authors: Samar Khanna, Patrick Liu, Linqi Zhou, Chenlin Meng, Robin Rombach, Marshall Burke, David B. Lobell, Stefano Ermon
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We describe the experiments for the tasks in section 3. Implementation details are in appendix A.1. |
| Researcher Affiliation | Collaboration | 1Stanford University, 2Stability AI, 3CZ Biohub |
| Pseudocode | No | The paper does not include pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | The project website can be found here: https://samar-khanna.github.io/DiffusionSat/ |
| Open Datasets | Yes | Instead, we compile publicly available annotated satellite data and contribute a large, high-resolution generative dataset for satellite images. Detailed descriptions on how the caption is generated for each dataset are in the appendix. (i) fMoW: Functional Map of the World (fMoW) Christie et al. (2018)... (ii) Satlas: Satlas Bastani et al. (2022)... (iii) SpaceNet: SpaceNet Van Etten et al. (2018; 2021)... |
| Dataset Splits | Yes | We generate 10000 samples on the validation sets of fMoW-RGB. |
| Hardware Specification | Yes | We use 8 NVIDIA A100 GPUs. |
| Software Dependencies | No | All models are trained on half-precision and with gradient checkpointing, borrowing from the Diffusers (von Platen et al., 2022) library. |
| Experiment Setup | Yes | The text-to-image models are trained with a batch size of 128 for 100000 iterations, which we determined was sufficient for convergence. We choose a constant learning rate of 2e-6 with the AdamW optimizer. |
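The training setup reported in the last two rows (AdamW at a constant learning rate of 2e-6, half-precision training with gradient checkpointing) can be sketched as a minimal PyTorch training step. This is an illustrative sketch only: the tiny MLP and toy batch are placeholders, not the DiffusionSat architecture or its data; only the optimizer and precision settings are taken from the paper.

```python
# Hedged sketch of the reported training configuration:
# AdamW with constant lr 2e-6, half-precision autocast, and
# gradient checkpointing. The MLP below is a stand-in model.
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-6)  # constant LR from the paper

x = torch.randn(4, 16)       # toy batch (the paper uses batch size 128)
target = torch.randn(4, 16)

optimizer.zero_grad()
# bfloat16 autocast stands in for the half-precision training on CPU
with torch.autocast("cpu", dtype=torch.bfloat16):
    # gradient checkpointing recomputes activations in the backward
    # pass to trade compute for memory, as the paper reports doing
    out = checkpoint(model, x, use_reentrant=False)
    loss = nn.functional.mse_loss(out.float(), target)
loss.backward()
optimizer.step()
print(float(loss))
```

In the actual paper the model is a latent diffusion UNet trained via the Diffusers library on 8 A100 GPUs; the snippet above only demonstrates the optimizer, precision, and checkpointing choices in isolation.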