Precipitation Downscaling with Spatiotemporal Video Diffusion
Authors: Prakhar Srivastava, Ruihan Yang, Gavin Kerrigan, Gideon Dresdner, Jeremy McGibbon, Christopher S. Bretherton, Stephan Mandt
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our approach on FV3GFS output, an established large-scale global atmosphere model, and compare it against six state-of-the-art baselines. Our analysis, capturing CRPS, MSE, precipitation distributions, and qualitative aspects using California and the Himalayas as examples, establishes our method as a new standard for data-driven precipitation downscaling. |
| Researcher Affiliation | Collaboration | Prakhar Srivastava¹, Ruihan Yang¹, Gavin Kerrigan¹, Gideon Dresdner², Jeremy McGibbon², Christopher Bretherton², Stephan Mandt¹; ¹University of California, Irvine; ²Allen Institute for AI, Seattle |
| Pseudocode | Yes | Algorithm 1: Training STVD and Algorithm 2: Sampling STVD (a hedged sketch of both appears after the table). |
| Open Source Code | Yes | The code for our model is available at https://github.com/mandt-lab/STVD. |
| Open Datasets | Yes | Our dataset derives from an 11-member initial condition ensemble of 13-month simulations using a global atmosphere model, FV3GFS, run at 25 km resolution and forced by climatological sea surface temperatures and sea ice. The first month of each simulation is discarded to allow the simulations to spin up and meteorologically diverge, effectively providing 11 years of reference data (of which the first 10 years are used for training and the last year for validation). FV3GFS, developed by the National Oceanic and Atmospheric Administration (NOAA), is a version of NOAA's operational global weather forecast model ([77, 10]). |
| Dataset Splits | Yes | effectively providing 11 years of reference data (of which the first 10 years are used for training and the last year for validation). |
| Hardware Specification | Yes | We optimize our model end-to-end with a single diffusion loss using Adam [28] with an initial learning rate of 1×10⁻⁴, decaying to 5×10⁻⁷ with cosine annealing during training, executed on an NVIDIA RTX A6000 GPU. |
| Software Dependencies | No | The paper mentions software components like 'Adam' and 'DDIM sampling' and refers to v-parametrization, but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, or other libraries). |
| Experiment Setup | Yes | Our approach (STVD) trains on 5 consecutive frames that are downscaled jointly. We optimize our model end-to-end with a single diffusion loss using Adam [28] with an initial learning rate of 1×10⁻⁴, decaying to 5×10⁻⁷ with cosine annealing during training, executed on an NVIDIA RTX A6000 GPU. The diffusion model is trained using v-parametrization [59], with a fixed diffusion depth (N = 1400). Random tiles extracted from the cube-sphere representation of Earth, with dimensions 384 in high resolution and 48 in low resolution, are used during training. We train for one million steps, requiring approximately 7 days on a single node (slightly less for ablations). We use a batch size of one, apply a logarithmic transformation to precipitation states, and normalize to the range [-1, 1]. During testing, we employ DDIM sampling with 30 steps on an Exponential Moving Average (EMA) variant of our model (for full frame size), with a decay rate of 0.995. (Hedged sketches of this setup appear after the table.) |
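
The Experiment Setup row reports Adam with a cosine-annealed learning rate (1×10⁻⁴ decaying to 5×10⁻⁷ over one million steps), an EMA of the weights with decay 0.995, and a logarithmic transform of precipitation normalized to [-1, 1]. The following is a minimal sketch of how these pieces could be wired together, assuming a PyTorch implementation; the stand-in `model`, the `preprocess_precip` helper, and the normalization constants are assumptions, not details taken from the released code.

```python
# A minimal sketch, assuming PyTorch (the framework is not stated in the table).
# `model` is a stand-in module; `preprocess_precip` and its constants are hypothetical.
import copy
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

TOTAL_STEPS = 1_000_000                  # "We train for one million steps"
EMA_DECAY = 0.995                        # EMA decay rate from the setup

model = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)    # placeholder network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # initial LR 1e-4
scheduler = CosineAnnealingLR(optimizer, T_max=TOTAL_STEPS, eta_min=5e-7)

ema_model = copy.deepcopy(model)         # EMA copy used at test time

@torch.no_grad()
def ema_update(ema, online, decay=EMA_DECAY):
    """Exponential moving average of the online weights."""
    for p_ema, p in zip(ema.parameters(), online.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)

def preprocess_precip(precip, log_min, log_max, eps=1e-6):
    """Log-transform precipitation, then rescale to [-1, 1].
    The epsilon and the log-space min/max statistics are assumptions."""
    x = torch.log(precip + eps)
    x = 2.0 * (x - log_min) / (log_max - log_min) - 1.0
    return x.clamp(-1.0, 1.0)
```

In a training loop, each optimizer step would be followed by `scheduler.step()` and `ema_update(ema_model, model)`, with `ema_model` used for DDIM sampling at test time.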
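The Pseudocode row refers to Algorithm 1 (Training STVD) and Algorithm 2 (Sampling STVD). As a rough illustration of what a v-parametrized diffusion training step and a 30-step deterministic DDIM sampler look like in general, here is a hedged sketch; the `denoiser(x_t, t, cond)` interface, the linear beta schedule, and the handling of low-resolution conditioning are placeholders and do not reflect the paper's actual architecture or conditioning scheme.

```python
# Illustrative reconstruction of a generic v-parametrized diffusion training step
# and deterministic DDIM sampling, not the released STVD algorithms.
import torch

N_DIFFUSION = 1400                               # fixed diffusion depth N
betas = torch.linspace(1e-4, 2e-2, N_DIFFUSION)  # assumed linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def training_loss(denoiser, x0, cond):
    """One v-parametrization training step (cf. Algorithm 1: Training STVD)."""
    t = torch.randint(0, N_DIFFUSION, (x0.shape[0],))
    a = alpha_bar[t].view(-1, *([1] * (x0.dim() - 1)))
    eps = torch.randn_like(x0)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps        # forward noising
    v_target = a.sqrt() * eps - (1 - a).sqrt() * x0   # v-parametrization target
    v_pred = denoiser(x_t, t, cond)
    return torch.mean((v_pred - v_target) ** 2)

@torch.no_grad()
def ddim_sample(denoiser, cond, shape, num_steps=30):
    """Deterministic DDIM sampling with 30 steps (cf. Algorithm 2: Sampling STVD)."""
    timesteps = torch.linspace(N_DIFFUSION - 1, 0, num_steps).long()
    x = torch.randn(shape)
    for i, t in enumerate(timesteps):
        a_t = alpha_bar[t]
        a_prev = alpha_bar[timesteps[i + 1]] if i + 1 < num_steps else torch.tensor(1.0)
        t_batch = torch.full((shape[0],), int(t), dtype=torch.long)
        v = denoiser(x, t_batch, cond)
        x0_hat = a_t.sqrt() * x - (1 - a_t).sqrt() * v    # clean-frame estimate
        eps_hat = (1 - a_t).sqrt() * x + a_t.sqrt() * v   # noise estimate
        x = a_prev.sqrt() * x0_hat + (1 - a_prev).sqrt() * eps_hat
    return x
```

Per the Experiment Setup row, sampling in the paper runs on the EMA weights at full frame size; the sketch above reflects only the constants (N = 1400, 30 DDIM steps) and omits the spatiotemporal conditioning details.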