Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Lightning UQ Box: Uncertainty Quantification for Neural Networks
Authors: Nils Lehmann, Nina Maria Gottschling, Jakob Gawlikowski, Adam J. Stewart, Stefan Depeweg, Eric Nalisnick
JMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Lightning UQ Box works towards this goal by supporting configuration of experiments with simple configuration files, as well as the Lightning command line interface (CLI). For example, the required configurations to run a partially stochastic BNN or Deep Kernel Learning model based on the timm library ResNet18 implementation on the EuroSAT dataset from torchgeo is shown in Figure 2. And also "to adequately evaluate the efficacy of these methods for various applications, a common modeling framework is necessary to foster the reproducibility of experiments, provide a fair evaluation, and make UQ methods more easily accessible to various research domains." |
| Researcher Affiliation | Collaboration | Nils Lehmann, Data Science in Earth Observation, Technical University of Munich; Stefan Depeweg, Siemens AG; Eric Nalisnick, Johns Hopkins University |
| Pseudocode | No | The paper describes a software library and its design principles, but does not present any pseudocode or algorithm blocks for novel methods. |
| Open Source Code | Yes | Lightning UQ Box (footnote 1) aims to fill this gap... Footnote 1: Lightning UQ Box GitHub repository and documentation |
| Open Datasets | Yes | For example, the required configurations to run a partially stochastic BNN or Deep Kernel Learning model based on the timm library ResNet18 implementation on the EuroSAT dataset from torchgeo is shown in Figure 2. ... (right) the same ResNet18 as Deep Kernel Learning model for training on the EuroSAT classification dataset from the geospatial PyTorch domain library TorchGeo (Stewart et al., 2022). |
| Dataset Splits | No | Figure 2 shows example YAML files for configuring models to train on the EuroSAT dataset, including the setting 'batch_size: 64'. However, the paper does not explicitly state the training, validation, and test splits (e.g., percentages or sample counts) used for the dataset. |
| Hardware Specification | No | The paper describes a software library and its functionalities, but does not provide specific details about the hardware (e.g., GPU models, CPU types) used for any experiments or development. |
| Software Dependencies | No | The paper mentions several software components like PyTorch, PyTorch Lightning, timm, and torchgeo, but does not provide specific version numbers for these dependencies, which are necessary for reproducible experiments. |
| Experiment Setup | Yes | Figure 2: Example YAML files that configure (left) a partially stochastic BNN based on a timm ResNet18 model implementation and (right) the same ResNet18 as Deep Kernel Learning model for training on the EuroSAT classification dataset from the geospatial PyTorch domain library TorchGeo (Stewart et al., 2022). The YAML files include hyperparameters such as 'num_mc_samples_train: 3', 'num_mc_samples_test: 25', 'batch_size: 64', 'max_epochs: 40', 'gradient_clip_val: 1.0', and 'accumulate_grad_batches: 2'. |
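
The hyperparameters quoted in the Experiment Setup row can be read together as a rough sketch of the training budget. The Python snippet below collects the quoted values in a flat dictionary and derives the number of samples that contribute to each optimizer step. The flat layout, the helper name `effective_batch_size`, and the assumptions that `batch_size` and `accumulate_grad_batches` belong to the same run and that training uses a single device are illustrative only; the paper's actual YAML nesting is not reproduced here.

```python
# Hyperparameter values quoted in the Experiment Setup row (Figure 2 of the paper).
# ASSUMPTION: the flat dictionary layout is for illustration only; how these
# keys are nested inside the real YAML files is not stated in this report.
hyperparameters = {
    "num_mc_samples_train": 3,
    "num_mc_samples_test": 25,
    "batch_size": 64,
    "max_epochs": 40,
    "gradient_clip_val": 1.0,
    "accumulate_grad_batches": 2,
}


def effective_batch_size(config: dict) -> int:
    """Samples contributing to one optimizer step under gradient accumulation.

    ASSUMPTION: single-device training, with both keys taken from the same run.
    """
    return config["batch_size"] * config["accumulate_grad_batches"]


if __name__ == "__main__":
    print(effective_batch_size(hyperparameters))  # 128
```

Under these assumptions, gradients from two batches of 64 are accumulated before each update, so a reproduction with a different per-step batch size may need a correspondingly adjusted learning rate or accumulation setting.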