Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Covariances, Robustness, and Variational Bayes
Authors: Ryan Giordano, Tamara Broderick, Michael I. Jordan
JMLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we demonstrate that our methods are simple, general, and fast, providing accurate posterior uncertainty estimates and robustness measures with runtimes that can be an order of magnitude faster than MCMC. Keywords: Variational Bayes; Bayesian robustness; Mean field approximation; Linear response theory; Laplace approximation; Automatic differentiation |
| Researcher Affiliation | Academia | Ryan Giordano EMAIL Department of Statistics, UC Berkeley 367 Evans Hall, UC Berkeley Berkeley, CA 94720; Tamara Broderick EMAIL Department of EECS, MIT 77 Massachusetts Ave., 38-401 Cambridge, MA 02139; Michael I. Jordan EMAIL Department of Statistics and EECS, UC Berkeley 367 Evans Hall, UC Berkeley Berkeley, CA 94720 |
| Pseudocode | No | The paper describes methods and derivations using mathematical formulations and textual descriptions, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and instructions to reproduce the results of this section can be found in the git repository rgiordan/Covariances_Robustness_VBPaper. |
| Open Datasets | Yes | The data and Stan implementations themselves can be found on the Stan website (Stan Team, 2017) as well as in Appendix F. To assess the accuracy of each model, we report means and standard deviations for each of Stan's model parameters as calculated by Stan's MCMC and ADVI algorithms and a Laplace approximation, and we report the standard deviations as calculated by Cov^LR_q0(g(θ)). |
| Dataset Splits | No | The paper describes data preparation and subsampling for the Criteo dataset, including randomly choosing 5000 distinct advertisers and subsampling them to no more than 20 rows each, resulting in 61895 total rows. However, it does not specify explicit training, validation, or test splits for any of the datasets used in the experiments. |
| Hardware Specification | No | The paper discusses runtimes and computational costs but does not specify any particular hardware (e.g., CPU, GPU models, or cloud configurations) used for running the experiments. |
| Software Dependencies | Yes | We used Stan (Stan Team, 2015), and to calculate the MFVB, Laplace, and LRVB estimates we used our own Python code using numpy, scipy, and autograd (Jones et al., 2001; Maclaurin et al., 2015). As described in Section 5.3.3, the MAP estimator did not estimate E_p0[g(θ)] very well, so we do not report standard deviations or sensitivity measures for the Laplace approximations. The Stan Modeling Language Users Guide and Reference Manual, Version 2.8.0, is cited for Stan. |
| Experiment Setup | Yes | We found M = 10000 to be more than adequate for our present purposes of illustration. (Section 5.1.2); In the examples we consider here, we found that the relatively modest M = 10 satisfies this condition and provides sufficiently accurate results. (Section 5.2.1); The prior parameters were taken to be σ²_µ = 0.010, σ²_β = 0.100, β_τ = 3.000. (Section 5.3.1); We used the re-parameterization trick and four points of Gauss-Hermite quadrature to estimate this integral for each observation. (Section 5.3.2) |
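The Experiment Setup row mentions using the re-parameterization trick with four points of Gauss-Hermite quadrature to estimate per-observation integrals. A minimal sketch of that technique using numpy (which the paper lists as a dependency) is below; the function name, test integrand, and parameters are illustrative, not taken from the authors' code:

```python
import numpy as np

def gauss_hermite_expectation(f, mu, sigma, n_points=4):
    """Approximate E[f(z)] for z ~ N(mu, sigma^2) with n_points quadrature nodes.

    Illustrative sketch only; not the paper's implementation.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    # Re-parameterize: z = mu + sqrt(2) * sigma * x maps the physicists'
    # Hermite weight exp(-x^2) onto the N(mu, sigma^2) density.
    z = mu + np.sqrt(2.0) * sigma * nodes
    return np.sum(weights * f(z)) / np.sqrt(np.pi)

# Hypothetical check: E[z^2] for z ~ N(1, 0.5^2) equals mu^2 + sigma^2 = 1.25,
# and four nodes integrate this low-degree polynomial exactly.
est = gauss_hermite_expectation(lambda z: z**2, mu=1.0, sigma=0.5)
```

Because the quadrature nodes are a deterministic function of (mu, sigma), the resulting objective stays smooth in the variational parameters, which is what makes this compatible with autograd-style automatic differentiation.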