Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology
Authors: Yi Xiao, Qilong Jia, Kun Chen, Lei Bai, Wei Xue
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the FengWu weather forecasting model: VAE-Var outperforms DiffDA and two traditional algorithms (interpolation and 3DVar) in terms of assimilation accuracy in sparse observational contexts, and is capable of assimilating real-world GDAS prepbufr observations over a year. ... Figures 3 and 4 present the RMSE (root mean square error) of the analysis states at various time steps to assess assimilation accuracy. |
| Researcher Affiliation | Collaboration | 1. Department of Computer Science and Technology, Tsinghua University; 2. Shanghai Artificial Intelligence Laboratory |
| Pseudocode | Yes | Algorithm 1: Training Set Construction; Algorithm 2: VAE-Var Assimilation; Algorithm 3: Cyclic Forecasting and Assimilation with VAE-Var; Algorithm 4: Logarithmic Interpolation Matrix Construction; Algorithm 5: Observation Operator Implementation in PyTorch |
| Open Source Code | Yes | The code of VAE-Var is available at https://github.com/xiaoyi018/VAE-Var. |
| Open Datasets | Yes | We sincerely acknowledge the European Centre for Medium-Range Weather Forecasts (ECMWF) for providing the ERA5 dataset and the National Center for Environmental Prediction (NCEP) for providing the GDAS dataset, which are instrumental in this study. Their efforts in data collection, archiving, and dissemination are greatly appreciated. Additionally, all the datasets we use, including ERA5 and GDAS prepbufr, are available online. |
| Dataset Splits | Yes | The six-hour forecasting model is trained using the ERA5 dataset from 1979 to 2015. ... We use ERA5 reanalysis data from 1979 to 2015 to train the VAE model... The system is simulated in an autoregressive manner for 15 days, starting from January 1, 2022. ... We select the year 2017 for conducting the assimilation experiment because the completeness of the observational data is highest for that year. |
| Hardware Specification | Yes | For example, on a single A100 GPU, one cycle of assimilation takes approximately 18 seconds. |
| Software Dependencies | No | The paper mentions 'PyTorch' and the 'torch-harmonics library', but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | The loss function consists of two components: the reconstruction loss (Loss_rec) and the Kullback-Leibler (KL) divergence (Loss_KL). ... a hyperparameter σ is introduced to balance the reconstruction loss and the KL divergence, and the total loss is expressed as Loss = (1/σ²)·Loss_rec + Loss_KL. ... the loss weight σ is set to 2.0. ... The observation covariance matrix R is assumed to be diagonal, with the square root of each entry set to 0.1 times the standard deviation of the respective variable. The parameter λ is set to 4.0. ... the assimilation cycle T equals six hours. |
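The σ-weighted VAE objective quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustrative PyTorch sketch, not the authors' implementation: the sum-of-squares reconstruction term and the standard Gaussian KL form are assumptions, while the 1/σ² weighting and σ = 2.0 default come from the paper's description.

```python
import torch

def vae_loss(x, x_hat, mu, logvar, sigma=2.0):
    """Total VAE loss: Loss = (1/sigma^2) * Loss_rec + Loss_KL.

    x       : original field
    x_hat   : decoder reconstruction
    mu      : encoder mean of q(z|x)
    logvar  : encoder log-variance of q(z|x)
    sigma   : balancing hyperparameter (2.0 in the paper)
    """
    # Reconstruction loss (assumed sum of squared errors)
    loss_rec = torch.sum((x - x_hat) ** 2)
    # KL divergence between q(z|x) = N(mu, exp(logvar)) and N(0, I)
    loss_kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return loss_rec / sigma**2 + loss_kl
```

With a perfect reconstruction and a posterior equal to the prior (mu = 0, logvar = 0), both terms vanish and the loss is zero, which is a quick sanity check on the implementation.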