SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery

Authors: Yezhen Cong, Samar Khanna, Chenlin Meng, Patrick Liu, Erik Rozi, Yutong He, Marshall Burke, David Lobell, Stefano Ermon

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our approach yields strong improvements over previous state-of-the-art techniques, both in terms of supervised learning performance on benchmark datasets (up to ↑ 7%), and transfer learning performance on downstream remote sensing tasks, including land cover classification (up to ↑ 14%) and semantic segmentation. Code and data are available on the project website: https://sustainlab-group.github.io/SatMAE/"
Researcher Affiliation | Academia | Yezhen Cong (yzcong@stanford.edu), Samar Khanna (samar.khanna@stanford.edu), Chenlin Meng, Patrick Liu, Erik Rozi, Yutong He, Marshall Burke, David B. Lobell, Stefano Ermon; Stanford University.
Pseudocode | No | The paper describes the MAE architecture and its modifications for temporal and multi-spectral data using descriptive text and diagrams (e.g., Figure 1), but it does not include any formal pseudocode or algorithm blocks (an illustrative sketch of the masking step appears after this table).
Open Source Code | Yes | Code and data are available on the project website: https://sustainlab-group.github.io/SatMAE/
Open Datasets | Yes | fMoW Sentinel: "We create a new dataset based on the fMoW RGB dataset. We collect all 13 frequency bands provided by Sentinel-2 (B1-12 and B8A) for the original fMoW locations, at some of the same times as fMoW images plus some extra times, for a total of 712,874 training images, 84,939 validation images, and 84,966 test images. More details are included in appendix A.1."
Dataset Splits | Yes | fMoW Sentinel: "... for a total of 712,874 training images, 84,939 validation images, and 84,966 test images" (Section 5.1). NAIP: "... 244,471 training and 55,529 validation images" (Section 5.5). BigEarthNet: "... 354,196 images for training and 118,065 images for validation" (Section 5.5).
Hardware Specification | Yes | Pre-training for 50 epochs on 8 NVIDIA V100 GPUs.
Software Dependencies | Yes | "Dataset is downloaded and processed with Python 3.8. The code to preprocess the data and reproduce the dataset will be released as part of our codebase. Our pre-trained model and downstream experiments are implemented in PyTorch 1.9."
Experiment Setup | Yes | Batch size 2048, learning rate 0.0001, weight decay 0.05, AdamW optimizer, warmup epochs 5, minimum learning rate 0.000001 (see the optimizer sketch after this table).
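Since the paper provides no formal pseudocode, the following is a minimal sketch of MAE-style random patch masking, the core mechanism that the paper's text and Figure 1 describe. It follows the standard MAE recipe rather than the authors' released code, and the 0.75 mask ratio is only the common MAE default, not a value quoted in the excerpts above.

```python
# Minimal sketch (not the authors' code) of MAE-style random masking of patch tokens.
import torch

def random_masking(x: torch.Tensor, mask_ratio: float = 0.75):
    """x: (batch, num_patches, embed_dim) patch tokens.
    Keeps a random subset of patches per sample; returns the kept tokens,
    a binary mask (1 = removed), and indices needed to restore patch order."""
    B, L, D = x.shape
    len_keep = int(L * (1 - mask_ratio))

    noise = torch.rand(B, L, device=x.device)      # per-patch random scores
    ids_shuffle = torch.argsort(noise, dim=1)      # ascending: smallest scores are kept
    ids_restore = torch.argsort(ids_shuffle, dim=1)

    ids_keep = ids_shuffle[:, :len_keep]
    x_masked = torch.gather(x, 1, ids_keep.unsqueeze(-1).repeat(1, 1, D))

    mask = torch.ones(B, L, device=x.device)
    mask[:, :len_keep] = 0
    mask = torch.gather(mask, 1, ids_restore)      # mask in original patch order
    return x_masked, mask, ids_restore
```

The visible tokens `x_masked` would go through the encoder, while `mask` and `ids_restore` are used by the decoder to place mask tokens back at the removed positions before reconstruction.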
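The experiment-setup row lists the optimizer hyperparameters but not the learning-rate decay shape. Below is a minimal sketch of that setup in PyTorch, assuming half-cycle cosine decay after the warmup (a common choice for MAE-style pre-training, not confirmed by the excerpt) and using a placeholder module in place of the actual SatMAE model.

```python
# Sketch of the reported pre-training optimization setup:
# AdamW, lr=1e-4, weight_decay=0.05, 5 warmup epochs, min lr=1e-6, 50 epochs total.
# The decay schedule after warmup is an assumption (half-cycle cosine).
import math
import torch

base_lr, min_lr = 1e-4, 1e-6
warmup_epochs, total_epochs = 5, 50

model = torch.nn.Linear(768, 768)  # placeholder for the actual SatMAE model
optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr, weight_decay=0.05)

def lr_at_epoch(epoch: float) -> float:
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if epoch < warmup_epochs:
        return base_lr * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + (base_lr - min_lr) * 0.5 * (1.0 + math.cos(math.pi * progress))

def set_lr(epoch: float) -> None:
    for group in optimizer.param_groups:
        group["lr"] = lr_at_epoch(epoch)

# Usage: call set_lr at the start of each epoch (or per step with a fractional
# epoch value) before optimizer.step() in the training loop.
set_lr(epoch=0.0)
```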