Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Automated Scalable Bayesian Inference via Hilbert Coresets
Authors: Trevor Campbell, Tamara Broderick
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test Hilbert coresets empirically on multivariate Gaussian inference, logistic regression, Poisson regression, and von Mises-Fisher mixture modeling with both real and synthetic data; these experiments show that Hilbert coresets provide high-quality posterior approximations and a significant reduction in the computational cost of inference. |
| Researcher Affiliation | Academia | Trevor Campbell EMAIL Department of Statistics University of British Columbia Vancouver, BC V6T 1Z4, Canada Tamara Broderick EMAIL Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Cambridge, MA 02139, USA |
| Pseudocode | Yes | Algorithm 1 IS: Hilbert coresets via importance sampling; Algorithm 2 FW: Hilbert coresets via Frank-Wolfe; Algorithm 3: Bayesian Hilbert coresets with random projection |
| Open Source Code | No | The paper does not provide a direct link to a source-code repository, an explicit statement of code release for the described methodology, or mention that code is provided in supplementary materials. |
| Open Datasets | Yes | The Phishing data set consisted of N = 11,055 data points... https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html The Chem React data set consisted of N = 26,733 data points... http://komarix.org/ac/ds/ The Bike Trips data set consisted of N = 17,386 data points... http://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset The Airport Delays data set consisted of N = 7,580 data points... Airport information from http://stat-computing.org/dataexpo/2009/the-data.html, with historical weather information from https://www.wunderground.com/history/. |
| Dataset Splits | Yes | For logistic regression, the Synthetic data set consisted of N = 10,000 data points (with 1,000 held out for testing) with covariate x_n ∈ R^2 sampled i.i.d. from N(0, I), and label y_n ∈ {-1, 1} generated from the logistic likelihood with parameter θ = [3, 3, 0]^T. The Phishing data set consisted of N = 11,055 data points (with 1,105 held out for testing), each with D = 68 features. In this data set, each covariate corresponds to the features of a website, and the goal is to predict whether or not a website is a phishing site. The Chem React data set consisted of N = 26,733 data points (with 2,673 held out for testing), each with D = 10 features. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions methods like 'random walk Metropolis-Hastings' and 'Gibbs sampling' but does not specify the software implementations or their version numbers used. |
| Experiment Setup | Yes | For both logistic regression and Poisson regression, we used the Laplace approximation (Bishop, 2006, Section 4.4) as the weighting distribution π̂ in the Hilbert coreset, with the random projection dimension set to D = 500. Posterior inference in each of the 50 trials was conducted using random walk Metropolis-Hastings with an isotropic multivariate Gaussian proposal distribution. We simulated a total of 100,000 steps, with 50,000 warmup steps including proposal covariance adaptation with a target acceptance rate of 0.234, and thinning of the latter 50,000 by a factor of 5, yielding 10,000 posterior samples. For directional clustering, the weighting distribution π̂ for the Hilbert coreset was constructed by finding maximum likelihood estimates of the cluster modes (µ̂_k)_{k=1}^K and weights using the EM algorithm, and then setting π̂ to an independent product of approximate posterior conditionals... The random projection dimension was set to D = 500, and the number of clusters K was set to 6. Posterior inference in each of the 50 trials was conducted using Gibbs sampling (introducing auxiliary label variables for the data) with a total of 100,000 steps, with 50,000 warmup steps and thinning of the latter 50,000 by a factor of 5, yielding 10,000 posterior samples. |
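The importance-sampling construction named in the Pseudocode row (Algorithm 1, IS) can be sketched as follows. This is a hedged illustration, not the paper's code: the function name `hilbert_coreset_is` is hypothetical, and it assumes each log-likelihood function has already been embedded as a finite-dimensional vector (e.g. via the paper's random projection), so the Hilbert norm reduces to a Euclidean norm.

```python
import numpy as np

def hilbert_coreset_is(L, M, rng=None):
    """Sketch of Hilbert coresets via importance sampling.

    L : (N, D) array whose n-th row is a finite-dimensional (e.g.
        random-projection) representation of the n-th log-likelihood.
    M : number of i.i.d. coreset draws.

    Returns a length-N weight vector w with at most M nonzero entries
    such that sum_n w_n L_n is an unbiased estimate of sum_n L_n.
    """
    rng = np.random.default_rng(rng)
    sigma_n = np.linalg.norm(L, axis=1)   # per-point norms sigma_n
    sigma = sigma_n.sum()                 # total norm sigma
    p = sigma_n / sigma                   # sampling probabilities
    counts = rng.multinomial(M, p)        # M i.i.d. categorical draws
    w = np.zeros(len(L))
    nz = counts > 0
    # importance weight: (count / M) * (sigma / sigma_n)
    w[nz] = sigma * counts[nz] / (M * sigma_n[nz])
    return w
```

With a large sample budget M, the weighted sum reproduces the full-data sum of log-likelihood vectors closely, which is the sense in which the coreset approximates the full posterior.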
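The sampler described in the Experiment Setup row can be illustrated with a minimal random-walk Metropolis-Hastings loop with warmup and thinning. This is a sketch under stated assumptions, not the paper's implementation: it omits the proposal covariance adaptation used during warmup, and the function name and step size are invented for illustration. With n_steps = 100,000, warmup = 50,000, and thin = 5, it retains (100,000 − 50,000) / 5 = 10,000 posterior samples, matching the setup above.

```python
import numpy as np

def rw_metropolis_hastings(log_post, theta0, n_steps, warmup, thin,
                           step=0.5, rng=None):
    """Random-walk Metropolis-Hastings with an isotropic Gaussian
    proposal, keeping every `thin`-th post-warmup state."""
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    samples = []
    for t in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        lp_prop = log_post(prop)
        # accept with probability min(1, exp(lp_prop - lp))
        if np.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        if t >= warmup and (t - warmup) % thin == 0:
            samples.append(theta.copy())
    return np.array(samples)
```

For example, targeting a standard normal posterior with `log_post = lambda th: -0.5 * th @ th` and a much smaller budget gives a quick sanity check of the warmup/thinning bookkeeping.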