Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

General Uncertainty Estimation with Delta Variances

Authors: Simon Schmitt, John Shawe-Taylor, Hado van Hasselt

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To empirically study the Delta Variance we build on the state-of-the-art GraphCast weather forecasting system (Lam et al. 2023)... We assess the Epistemic Variance predictions on 5 years of hold-out data using multiple metrics such as the correlation between predicted variance and prediction error and the likelihood of the quantities of interest. Empirically Delta Variances with a diagonal Fisher approximation yield competitive results at lower computational cost, see Figure 3.
Researcher Affiliation | Collaboration | 1 DeepMind, 2 University College London, UK
Pseudocode | No | The paper describes methods and derivations in paragraph form and mathematical equations. It does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | No | We build on the state-of-the-art GraphCast weather prediction system... Training data ranges from 1979-2013 with validation data from 2014-2017 and holdout data from 2018-2021 resulting in about 100 GB of weather data. While the paper cites the GraphCast system (Lam et al. 2023), it does not provide explicit access information (link, DOI, specific repository) for the *specific data* used in their experiments.
Dataset Splits | Yes | Training data ranges from 1979-2013 with validation data from 2014-2017 and holdout data from 2018-2021, resulting in about 100 GB of weather data.
Hardware Specification | No | The paper does not specify any particular hardware (GPU, CPU, TPU models) used for training or inference. It only mentions, 'To save resources we retrain the model for a grid size of 4 degrees and reduce the number of layers and latents each by a factor of 2.'
Software Dependencies | No | The paper mentions the use of 'any auto-differentiation framework' and the 'GraphCast weather forecasting system (Lam et al. 2023)' but does not provide specific version numbers for any software dependencies or libraries used in their implementation.
Experiment Setup | Yes | To save resources we retrain the model for a grid size of 4 degrees and reduce the number of layers and latents each by a factor of 2. Finally we skip the fine-tuning curriculum for simplicity. In our experiments we optimize the coefficients of this linear combination using gradient descent to improve the log-likelihood or correlation on a small set of held-out validation data.
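The Research Type row quotes that "Delta Variances with a diagonal Fisher approximation yield competitive results at lower computational cost". As a minimal illustration of that construction (not the authors' code: the toy linear model, the data, and all function names here are assumptions), the delta method estimates the epistemic variance of a scalar prediction f(theta) via the gradient-weighted inverse Fisher, which for a diagonal Fisher F reduces to Var[f] ≈ sum_i (df/dtheta_i)^2 / F_ii:

```python
import numpy as np

def diag_fisher(loss_grads):
    """Diagonal empirical Fisher: mean squared per-example loss gradient."""
    return np.mean(loss_grads ** 2, axis=0)

def delta_variance(pred_grad, fisher_diag, eps=1e-8):
    """Delta-method variance of a scalar prediction: sum_i g_i^2 / F_ii."""
    return np.sum(pred_grad ** 2 / (fisher_diag + eps))

# Toy linear model y = x . theta stands in for the forecasting network.
rng = np.random.default_rng(0)
theta = rng.normal(size=5)
X = rng.normal(size=(100, 5))
y = X @ theta + 0.1 * rng.normal(size=100)

# Per-example gradient of the squared-error loss w.r.t. theta.
residuals = X @ theta - y
loss_grads = 2.0 * residuals[:, None] * X

F = diag_fisher(loss_grads)
x_query = rng.normal(size=5)       # for f(theta) = x . theta the gradient is x itself
var = delta_variance(x_query, F)   # epistemic variance estimate for this input
```

The appeal of the diagonal approximation, as the quote notes, is cost: it needs only per-example gradients and elementwise arithmetic, never a full (or inverted) Fisher matrix.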
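The Experiment Setup quote mentions optimizing the coefficients of a linear combination by gradient descent to improve the log-likelihood on held-out validation data. A hedged sketch of that step, assuming a Gaussian likelihood over prediction errors and an exp parametrization to keep the weights positive (neither choice is stated in the report):

```python
import numpy as np

def fit_variance_weights(V, sq_err, steps=500, lr=0.05):
    """Fit positive weights w so that sigma2 = V @ w maximizes the mean
    Gaussian log-likelihood -0.5*(log sigma2 + sq_err/sigma2) on validation data.

    V      : (n, k) candidate variance predictions per validation example
    sq_err : (n,)   squared prediction errors on the same examples
    """
    log_w = np.zeros(V.shape[1])                      # w = exp(log_w) stays positive
    for _ in range(steps):
        w = np.exp(log_w)
        sigma2 = V @ w + 1e-8
        # Derivative of the per-example log-likelihood w.r.t. sigma2.
        dL = -0.5 * (1.0 / sigma2 - sq_err / sigma2 ** 2)
        log_w += lr * (V.T @ dL / len(sq_err)) * w    # ascent step, chain rule via exp
    return np.exp(log_w)

# Toy validation set: three candidate variance estimates mixed with known weights.
rng = np.random.default_rng(0)
V = rng.uniform(0.5, 1.5, size=(200, 3))
true_sigma2 = V @ np.array([0.5, 1.0, 0.2])
sq_err = true_sigma2 * rng.standard_normal(200) ** 2

w_fit = fit_variance_weights(V, sq_err)
```

Optimizing for correlation with observed errors, the other criterion the quote mentions, would only change the objective inside the loop.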