Spatial-Temporal Super-Resolution of Satellite Imagery via Conditional Pixel Synthesis

Authors: Yutong He, Dingjie Wang, Nicholas Lai, William Zhang, Chenlin Meng, Marshall Burke, David Lobell, Stefano Ermon

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that our model attains photo-realistic sample quality and outperforms competing baselines on a key downstream task, object counting, particularly in geographic locations where conditions on the ground are changing rapidly." "To demonstrate the effectiveness of our model, we collect a large-scale paired satellite image dataset..." "To evaluate our model's performance, we compare to state-of-the-art methods..." (Section 5, Experiments)
Researcher Affiliation | Academia | Yutong He, Dingjie Wang, Nicholas Lai, William Zhang, Chenlin Meng, Marshall Burke, David B. Lobell, Stefano Ermon; Stanford University. {kellyyhe, daviddw, nicklai, wxyz, chenlin, ermon}@cs.stanford.edu, {mburke, dlobell}@stanford.edu
Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper.
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing code or a link to a code repository.
Open Datasets | Yes | Texas Housing dataset: "We collect a dataset consisting of 286717 houses and their surrounding neighborhoods from the CoreLogic tax and deed database that have an effective year built between 2014 and 2017 in Texas, US... We source high resolution images from NAIP (1m GSD) and low resolution images from Sentinel-2 (10m GSD) and only extract RGB bands from Google Earth Engine [14]." fMoW-Sentinel2 Crop Field dataset: "We derive this dataset from the crop field category in the Functional Map of the World (fMoW) dataset [9] for the task of generating images over a greater number of time steps."
Dataset Splits | No | The paper describes training and testing sets but does not provide a separate validation split. Texas Housing: "We reserve 14101 houses from 20 randomly selected zip codes as the testing set and use the remaining 272616 houses from the other 759 zip codes as the training set." fMoW-Sentinel2: "We reserve 237 locations as the testing set and the remaining 1515 locations as the training set."
Hardware Specification | Yes | "We train each model to convergence, which takes around 4-5 days on 1 NVIDIA Titan XP GPU."
Software Dependencies | No | The paper mentions the Adam optimizer and, implicitly, a deep learning framework, but does not provide version numbers for any software dependency.
Experiment Setup | Yes | "We choose H = W = 256, C = 3 (the concatenated RGB bands of the input images), Cfea = 256, m = 3, n = 14 and λ = 100 for all of our experiments. We use non-saturating conditional GAN loss for G and R1 penalty for D, which has the same network structure as the discriminator in [19, 2]. We train all models using Adam optimizer with learning rate 2 × 10^-3, β0 = 0, β1 = 0.99, ε = 10^-8."
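The quoted experiment setup can be restated as a minimal Python sketch. The dictionary keys and the `nonsat_g_loss` helper are our own illustrative names, not the authors' code; the numeric values come directly from the quote above.

```python
import math

# Illustrative restatement of the reported hyperparameters
# (names are ours; values are quoted from the paper).
HPARAMS = {
    "H": 256, "W": 256, "C": 3,   # image size and concatenated RGB channels
    "C_fea": 256,                  # feature channel width
    "m": 3, "n": 14,               # architecture depth parameters
    "lambda": 100,                 # loss weighting term
    "lr": 2e-3,                    # Adam learning rate
    "betas": (0.0, 0.99),          # Adam (beta0, beta1)
    "eps": 1e-8,                   # Adam epsilon
}

def nonsat_g_loss(d_fake_logit: float) -> float:
    """Non-saturating GAN generator loss for one discriminator logit
    on a generated sample: softplus(-D(G(z)))."""
    return math.log1p(math.exp(-d_fake_logit))
```

At a logit of 0 (discriminator maximally uncertain) the generator loss equals log 2, the usual fixed point of the non-saturating formulation.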
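The zip-code-held-out split quoted in the Dataset Splits row (whole zip codes reserved for testing, so no location leaks between splits) can be sketched as follows; `split_by_group` and its arguments are illustrative, not from the paper.

```python
import random

def split_by_group(items, groups, n_test_groups, seed=0):
    """Hold out entire groups (e.g., zip codes) so that no group
    appears in both the training and the testing set."""
    rng = random.Random(seed)
    unique_groups = sorted(set(groups))
    test_groups = set(rng.sample(unique_groups, n_test_groups))
    train = [x for x, g in zip(items, groups) if g not in test_groups]
    test = [x for x, g in zip(items, groups) if g in test_groups]
    return train, test

# Toy usage: 10 houses across 5 zip codes, holding out 2 zip codes.
houses = list(range(10))
zips = [i % 5 for i in houses]
train, test = split_by_group(houses, zips, n_test_groups=2)
```

Splitting by group rather than by individual sample is what makes the paper's evaluation meaningful for unseen locations.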