SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery
Authors: Yezhen Cong, Samar Khanna, Chenlin Meng, Patrick Liu, Erik Rozi, Yutong He, Marshall Burke, David Lobell, Stefano Ermon
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach yields strong improvements over previous state-of-the-art techniques, both in terms of supervised learning performance on benchmark datasets (up to ↑ 7%), and transfer learning performance on downstream remote sensing tasks, including land cover classification (up to ↑ 14%) and semantic segmentation. Code and data are available on the project website: https://sustainlab-group.github.io/SatMAE/ |
| Researcher Affiliation | Academia | Yezhen Cong yzcong@stanford.edu Samar Khanna samar.khanna@stanford.edu Chenlin Meng Patrick Liu Erik Rozi Yutong He Marshall Burke David B. Lobell Stefano Ermon Stanford University |
| Pseudocode | No | The paper describes the MAE architecture and its modifications for temporal and multi-spectral data using descriptive text and diagrams (e.g., Figure 1), but it does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and data are available on the project website: https://sustainlab-group.github.io/SatMAE/ |
| Open Datasets | Yes | fMoW Sentinel: We create a new dataset based on the fMoW RGB dataset. We collect all 13 frequency bands provided by Sentinel-2 (B1-12 and B8A) for the original fMoW locations, at some of the same times as fMoW images plus some extra times, for a total of 712,874 training images, 84,939 validation images, and 84,966 test images. More details are included in appendix A.1. |
| Dataset Splits | Yes | fMoW Sentinel... for a total of 712,874 training images, 84,939 validation images, and 84,966 test images. (Section 5.1). NAIP... 244,471 training and 55,529 validation images. (Section 5.5). BigEarthNet... 354,196 images for training and 118,065 images for validation. (Section 5.5). |
| Hardware Specification | Yes | Pre-training for 50 epochs on 8 NVIDIA V100 GPUs. |
| Software Dependencies | Yes | Dataset is downloaded and processed with Python 3.8. The code to preprocess the data and reproduce the dataset will be released as part of our codebase. Our pre-trained model and downstream experiments are implemented in PyTorch 1.9. |
| Experiment Setup | Yes | Batch size 2048, learning rate 0.0001, weight decay 0.05, AdamW optimizer, warmup epochs 5, min learning rate 0.000001. |
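
The hyperparameters quoted in the Experiment Setup row can be read as a pre-training optimizer configuration. Below is a minimal sketch in PyTorch (the framework the paper reports using), assuming the standard MAE-style linear warmup followed by cosine decay to the minimum learning rate; the schedule shape beyond the quoted warmup epochs and minimum LR, and the placeholder model, are assumptions for illustration only.

```python
import math
import torch

# Values quoted in the Experiment Setup / Hardware rows.
BATCH_SIZE = 2048
BASE_LR = 1e-4
MIN_LR = 1e-6
WEIGHT_DECAY = 0.05
WARMUP_EPOCHS = 5
TOTAL_EPOCHS = 50  # "Pre-training for 50 epochs"

# Placeholder module standing in for the SatMAE encoder/decoder.
model = torch.nn.Linear(8, 8)

# AdamW with the reported base learning rate and weight decay.
optimizer = torch.optim.AdamW(model.parameters(), lr=BASE_LR, weight_decay=WEIGHT_DECAY)

def lr_at_epoch(epoch: float) -> float:
    """Linear warmup, then cosine decay to MIN_LR (assumed MAE-style schedule)."""
    if epoch < WARMUP_EPOCHS:
        return BASE_LR * epoch / WARMUP_EPOCHS
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return MIN_LR + (BASE_LR - MIN_LR) * 0.5 * (1.0 + math.cos(math.pi * progress))

# Applied once per epoch (or fractionally per step) before optimizer.step().
for group in optimizer.param_groups:
    group["lr"] = lr_at_epoch(epoch=0.0)
```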
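
Since the Pseudocode row notes that the paper describes its architecture only in prose and diagrams, the sketch below illustrates the generic MAE-style random masking of patch tokens that SatMAE builds on. It omits the paper's temporal encodings and spectral band grouping, uses the original MAE default mask ratio of 0.75 rather than a value quoted above, and is an illustrative sketch, not the authors' implementation.

```python
import torch

def random_masking(tokens: torch.Tensor, mask_ratio: float = 0.75):
    """MAE-style random masking of patch tokens.

    tokens: (batch, num_patches, dim) patch embeddings.
    Returns the visible (kept) tokens, a binary mask (1 = removed),
    and the indices needed to restore the original patch ordering.
    """
    b, n, d = tokens.shape
    n_keep = int(n * (1.0 - mask_ratio))

    noise = torch.rand(b, n, device=tokens.device)  # random score per patch
    ids_shuffle = torch.argsort(noise, dim=1)       # ascending: lowest scores are kept
    ids_restore = torch.argsort(ids_shuffle, dim=1)

    ids_keep = ids_shuffle[:, :n_keep]
    visible = torch.gather(tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, d))

    mask = torch.ones(b, n, device=tokens.device)
    mask[:, :n_keep] = 0
    mask = torch.gather(mask, 1, ids_restore)       # back to original patch order
    return visible, mask, ids_restore
```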