Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

LoRA-EnVar: Parameter-Efficient Hybrid Ensemble Variational Assimilation for Weather Forecasting

Authors: Yi Xiao, Hang Fan, Kun Chen, Ye Cao, Ben Fei, Wei Xue, Lei Bai

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate LoRA-EnVar in high-resolution assimilation settings using the FengWu forecast model and simulated observations from ERA5 reanalysis. Experimental results show that LoRA-EnVar significantly improves assimilation accuracy over models assuming static background error distribution and achieves comparable or better performance than full finetuning while reducing the number of trainable parameters by three orders of magnitude.
Researcher Affiliation | Collaboration | Yi Xiao (Tsinghua University, Beijing; Shanghai Artificial Intelligence Laboratory, Shanghai); Hang Fan (Columbia University, New York); Kun Chen (Fudan University, Shanghai; Shanghai Artificial Intelligence Laboratory, Shanghai); Ye Cao (Tsinghua University, Beijing); Ben Fei (The Chinese University of Hong Kong, Hong Kong; Shanghai Artificial Intelligence Laboratory, Shanghai); Wei Xue (Tsinghua University, Beijing); Lei Bai (Shanghai Artificial Intelligence Laboratory, Shanghai)
Pseudocode | Yes | Algorithm 1: LoRA-EnVar Assimilation Step. Algorithm 2: NMC-Based Training Set Construction.
Open Source Code | Yes | The code is available at https://github.com/xiaoyi018/AI-VarDA.
Open Datasets | Yes | We validate LoRA-EnVar in high-resolution assimilation settings using the FengWu forecast model and simulated observations from ERA5 reanalysis. ...FengWu is a learning-based medium-range weather forecasting model trained on the ERA5 reanalysis dataset. ...We conduct additional experiments using real-world observations from the NOAA GDAS (Global Data Assimilation System) prepbufr dataset.
Dataset Splits | Yes | After finetuning, we perform variational data assimilation using simulated observations sampled from ERA5 analysis fields at 1000 random points. ...we conduct a cyclic assimilation and forecasting experiment over a full month, starting from 00:00 UTC on January 1st, 2022. ...conducting cyclic assimilation experiments with 250, 500, 1000, and 2000 observations per cycle, corresponding to approximately 0.024%, 0.048%, 0.096%, and 0.19% of the total grid points, respectively. ...At each assimilation time, we randomly exclude 15% of the available observational stations from assimilation. The remaining 85% of observations are used to perform data assimilation as usual.
Hardware Specification | Yes | Experiments are performed on a single NVIDIA A100 GPU, with each assimilation cycle requiring approximately 14 seconds for finetuning and 10 seconds for assimilation.
Software Dependencies | No | The paper mentions "PyTorch" [59] and "Adam optimizer [62]" but does not specify particular version numbers for these or other key software components.
Experiment Setup | Yes | The learning rate for offline VAE training is set to 10^-4. During finetuning, we use a learning rate of 10^-5 for full model finetuning, and 10^-2 for LoRA-based finetuning. In both cases, finetuning is performed for five epochs. Both training and finetuning procedures use the Adam optimizer [62]. Unless otherwise stated, the LoRA rank is fixed at 2 across all experiments (see the Appendix for ablation studies). Background Ensemble Generation: We follow the framework illustrated in Figure 1, using an ensemble size of 8 (see the Appendix for experiments with a different ensemble size).
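
The observation densities and the 85%/15% station split quoted under Dataset Splits can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper's code. It assumes a 0.25-degree global grid of 721 x 1440 points, which is not stated in the summary above, and the function names `observation_fraction_percent` and `holdout_split` are hypothetical.

```python
import random

# Assumption: a 0.25-degree global latitude-longitude grid (721 x 1440 points).
# The grid size is not given in the summary above.
N_LAT, N_LON = 721, 1440
GRID_POINTS = N_LAT * N_LON  # 1,038,240 points per level


def observation_fraction_percent(n_obs, grid_points=GRID_POINTS):
    """Return the share of grid points covered by n_obs observations, in percent."""
    return 100.0 * n_obs / grid_points


def holdout_split(station_ids, holdout_fraction=0.15, seed=0):
    """Randomly split stations into (assimilated, withheld) lists.

    Hypothetical helper: withholds roughly `holdout_fraction` of the stations
    and keeps the rest for assimilation, mirroring the 85%/15% description.
    """
    rng = random.Random(seed)
    shuffled = list(station_ids)
    rng.shuffle(shuffled)
    n_holdout = round(holdout_fraction * len(shuffled))
    return shuffled[n_holdout:], shuffled[:n_holdout]


if __name__ == "__main__":
    for n_obs in (250, 500, 1000, 2000):
        print(n_obs, f"{observation_fraction_percent(n_obs):.3f}%")
    assimilated, withheld = holdout_split(range(100))
    print(len(assimilated), len(withheld))
```

Under the assumed grid size, the four observation counts come out at roughly 0.024%, 0.048%, 0.096%, and 0.19% of the grid points, consistent with the figures quoted in the table.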
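
For readers who want to reuse the reported settings, the hyperparameters listed under Experiment Setup can be gathered into one configuration object. This is a minimal sketch, not the authors' configuration format. The class and field names are hypothetical, and the learning rates are read as negative powers of ten (1e-4, 1e-5, 1e-2), on the assumption that the minus signs of the exponents were lost in extraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssimilationSetup:
    """Hypothetical container for the settings reported in the table above."""

    vae_learning_rate: float = 1e-4            # offline VAE training
    full_finetune_learning_rate: float = 1e-5  # full model finetuning
    lora_finetune_learning_rate: float = 1e-2  # LoRA-based finetuning
    finetune_epochs: int = 5                   # same for both finetuning modes
    optimizer: str = "Adam"                    # used for training and finetuning
    lora_rank: int = 2                         # default rank unless otherwise stated
    ensemble_size: int = 8                     # background ensemble members

    def finetune_learning_rate(self, use_lora: bool) -> float:
        """Pick the finetuning learning rate for the chosen mode."""
        if use_lora:
            return self.lora_finetune_learning_rate
        return self.full_finetune_learning_rate


if __name__ == "__main__":
    setup = AssimilationSetup()
    print(setup)
    print(setup.finetune_learning_rate(use_lora=True))
```

Freezing the dataclass keeps the reported defaults from being changed by accident; an ablation over rank or ensemble size would create a new instance, for example `AssimilationSetup(lora_rank=4)`.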