Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Automated Scalable Bayesian Inference via Hilbert Coresets
Authors: Trevor Campbell, Tamara Broderick
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test Hilbert coresets empirically on multivariate Gaussian inference, logistic regression, Poisson regression, and von Mises-Fisher mixture modeling with both real and synthetic data; these experiments show that Hilbert coresets provide high-quality posterior approximations and a significant reduction in the computational cost of inference. |
| Researcher Affiliation | Academia | Trevor Campbell EMAIL Department of Statistics University of British Columbia Vancouver, BC V6T 1Z4, Canada Tamara Broderick EMAIL Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Cambridge, MA 02139, USA |
| Pseudocode | Yes | Algorithm 1 IS: Hilbert coresets via importance sampling; Algorithm 2 FW: Hilbert coresets via Frank-Wolfe; Algorithm 3: Bayesian Hilbert coresets with random projection |
| Open Source Code | No | The paper does not provide a direct link to a source-code repository, an explicit statement of code release for the described methodology, or mention that code is provided in supplementary materials. |
| Open Datasets | Yes | The Phishing data set consisted of N = 11,055 data points... https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html The Chem React data set consisted of N = 26,733 data points... http://komarix.org/ac/ds/ The Bike Trips data set consisted of N = 17,386 data points... http://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset The Airport Delays data set consisted of N = 7,580 data points... Airport information from http://stat-computing.org/dataexpo/2009/the-data.html, with historical weather information from https://www.wunderground.com/history/. |
| Dataset Splits | Yes | For logistic regression, the Synthetic data set consisted of N = 10,000 data points (with 1,000 held out for testing) with covariate x_n ∈ R^2 sampled i.i.d. from N(0, I), and label y_n ∈ {-1, 1} generated from the logistic likelihood with parameter θ = [3, 3, 0]^T. The Phishing data set consisted of N = 11,055 data points (with 1,105 held out for testing), each with D = 68 features. In this data set, each covariate corresponds to the features of a website, and the goal is to predict whether or not a website is a phishing site. The Chem React data set consisted of N = 26,733 data points (with 2,673 held out for testing), each with D = 10 features. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions methods like 'random walk Metropolis-Hastings' and 'Gibbs sampling' but does not specify the software implementations or their version numbers used. |
| Experiment Setup | Yes | For both logistic regression and Poisson regression, we used the Laplace approximation (Bishop, 2006, Section 4.4) as the weighting distribution π̂ in the Hilbert coreset, with the random projection dimension set to D = 500. Posterior inference in each of the 50 trials was conducted using random walk Metropolis-Hastings with an isotropic multivariate Gaussian proposal distribution. We simulated a total of 100,000 steps, with 50,000 warmup steps including proposal covariance adaptation with a target acceptance rate of 0.234, and thinning of the latter 50,000 by a factor of 5, yielding 10,000 posterior samples. For directional clustering, the weighting distribution π̂ for the Hilbert coreset was constructed by finding maximum likelihood estimates of the cluster modes (µ̂_k)_{k=1}^K and weights using the EM algorithm, and then setting π̂ to an independent product of approximate posterior conditionals... The random projection dimension was set to D = 500, and the number of clusters K was set to 6. Posterior inference in each of the 50 trials was conducted using Gibbs sampling (introducing auxiliary label variables for the data) with a total of 100,000 steps, with 50,000 warmup steps and thinning of the latter 50,000 by a factor of 5, yielding 10,000 posterior samples. |
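The importance-sampling construction named in the Pseudocode row (Algorithm 1, IS) can be sketched as follows. This is a hedged illustration, not the paper's code: the function name `hilbert_coreset_is` is hypothetical, and it assumes each log-likelihood function has already been embedded as a finite-dimensional vector (e.g. via the paper's random projection), so the Hilbert norm reduces to a Euclidean norm.

```python
import numpy as np

def hilbert_coreset_is(L, M, rng=None):
    """Sketch of Hilbert coresets via importance sampling.

    L : (N, D) array whose n-th row is a finite-dimensional (e.g.
        random-projection) representation of the n-th log-likelihood.
    M : number of i.i.d. coreset draws.

    Returns a length-N weight vector w with at most M nonzero entries
    such that sum_n w_n L_n is an unbiased estimate of sum_n L_n.
    """
    rng = np.random.default_rng(rng)
    sigma_n = np.linalg.norm(L, axis=1)   # per-point norms sigma_n
    sigma = sigma_n.sum()                 # total norm sigma
    p = sigma_n / sigma                   # sampling probabilities
    counts = rng.multinomial(M, p)        # M i.i.d. categorical draws
    w = np.zeros(len(L))
    nz = counts > 0
    # importance weight: (count / M) * (sigma / sigma_n)
    w[nz] = sigma * counts[nz] / (M * sigma_n[nz])
    return w
```

With a large sample budget M, the weighted sum reproduces the full-data sum of log-likelihood vectors closely, which is the sense in which the coreset approximates the full posterior.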
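The sampler described in the Experiment Setup row can be illustrated with a minimal random-walk Metropolis-Hastings loop with warmup and thinning. This is a sketch under stated assumptions, not the paper's implementation: it omits the proposal covariance adaptation used during warmup, and the function name and step size are invented for illustration. With n_steps = 100,000, warmup = 50,000, and thin = 5, it retains (100,000 − 50,000) / 5 = 10,000 posterior samples, matching the setup above.

```python
import numpy as np

def rw_metropolis_hastings(log_post, theta0, n_steps, warmup, thin,
                           step=0.5, rng=None):
    """Random-walk Metropolis-Hastings with an isotropic Gaussian
    proposal, keeping every `thin`-th post-warmup state."""
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    samples = []
    for t in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        lp_prop = log_post(prop)
        # accept with probability min(1, exp(lp_prop - lp))
        if np.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        if t >= warmup and (t - warmup) % thin == 0:
            samples.append(theta.copy())
    return np.array(samples)
```

For example, targeting a standard normal posterior with `log_post = lambda th: -0.5 * th @ th` and a much smaller budget gives a quick sanity check of the warmup/thinning bookkeeping.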