reproducibilityindex.ai

Effective Bayesian Heteroscedastic Regression with Deep Neural Networks

Authors: Alexander Immer, Emanuele Palumbo, Alexander Marx, Julia Vogt

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate the effectiveness of the natural parameterization compared to the mean-variance (naive) one, and empirical Bayes (EB) to optimizing a single regularization parameter using a grid search on the validation set (GS), and the MAP prediction vs a Bayesian posterior predictive (PP) in comparison to state-of-the-art baselines on three experimental settings: the UCI regression benchmark [Hernández-Lobato and Adams, 2015], which is also well-established for heteroscedastic regression [Seitzer et al., 2022, Stirn et al., 2023], the recently introduced CRISPR-Cas13 gene expression datasets [Stirn et al., 2023], and our proposed heteroscedastic image-regression dataset (cf. Problem 2.1) in three noise variants.
Researcher Affiliation	Academia	(1) Department of Computer Science, ETH Zurich, Switzerland; (2) Max Planck Institute for Intelligent Systems, Tübingen, Germany; (3) AI Center, ETH Zurich, Switzerland
Pseudocode	Yes	Algorithm 1 Optimization of Heteroscedastic Regression Models
Open Source Code	Yes	Code at https://github.com/aleximmer/heteroscedastic-nn.
Open Datasets	Yes	We evaluate the effectiveness of the natural parameterization compared to the mean-variance (naive) one, and empirical Bayes (EB) to optimizing a single regularization parameter using a grid search on the validation set (GS), and the MAP prediction vs a Bayesian posterior predictive (PP) in comparison to state-of-the-art baselines on three experimental settings: the UCI regression benchmark [Hernández-Lobato and Adams, 2015], which is also well-established for heteroscedastic regression [Seitzer et al., 2022, Stirn et al., 2023], the recently introduced CRISPR-Cas13 gene expression datasets [Stirn et al., 2023], and our proposed heteroscedastic image-regression dataset (cf. Problem 2.1) in three noise variants.
Dataset Splits	Yes	For all methods using grid-search, we first split the training data into a 90/10 train-validation split.
Hardware Specification	Yes	The training was done 5 times (different seeds) per model-dataset pair to estimate mean and standard error and were run on a computing cluster with V100 and A100 NVIDIA GPUs.
Software Dependencies	No	The paper mentions software like 'PyTorch implementation from Krishnan et al. [2022]', 'laplace-torch package [Daxberger et al., 2021]', 'automatic second-order differentiation library [asdl; Osawa, 2021]', 'pytorch [Paszke et al., 2017]', and 'jax [Bradbury et al., 2018]', but does not provide specific version numbers for these general software dependencies.
Experiment Setup	Yes	We train all models, except for the VI and MC-Dropout baselines, with Adam optimizer using a batch size of 256 for 5000 epochs and an initial learning rate of 10^-2 that is decayed to 10^-5 using a cosine schedule.
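
The Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch in plain Python. It is not taken from the authors' repository. The function names, the shuffling seed, and the choice to update the learning rate once per epoch are assumptions made for illustration. The cosine formula is the standard annealing rule, and the paper may apply it per optimizer step instead.

```python
import math
import random

LR_INIT = 1e-2    # initial learning rate quoted in the table
LR_FINAL = 1e-5   # final learning rate quoted in the table
EPOCHS = 5000     # number of epochs quoted in the table
BATCH_SIZE = 256  # batch size quoted in the table


def cosine_lr(epoch, total=EPOCHS, lr_init=LR_INIT, lr_final=LR_FINAL):
    """Cosine-annealed learning rate (assumption: one update per epoch)."""
    # Clamp progress to [0, 1] so out-of-range epochs stay within the schedule.
    progress = min(max(epoch / total, 0.0), 1.0)
    return lr_final + 0.5 * (lr_init - lr_final) * (1.0 + math.cos(math.pi * progress))


def train_val_split(n_samples, val_fraction=0.1, seed=0):
    """Shuffle sample indices and hold out val_fraction of them for validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # hypothetical seed, not from the paper
    n_val = int(round(val_fraction * n_samples))
    return indices[n_val:], indices[:n_val]


if __name__ == "__main__":
    train_idx, val_idx = train_val_split(1000)
    print(len(train_idx), len(val_idx))
    for epoch in (0, EPOCHS // 2, EPOCHS):
        print(epoch, cosine_lr(epoch))
```

Under these assumptions the learning rate starts at 10^-2, passes through the midpoint of the two endpoints halfway through training, and ends at 10^-5. The split holds out one tenth of the training indices for the grid search.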