General bounds on the quality of Bayesian coresets

Author: Trevor Campbell

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper includes empirical validation of the main theoretical claims on two models that violate common assumptions made in the literature: a multimodal, unidentifiable Cauchy location model with a heavy-tailed prior, and an unidentifiable logistic regression model with a heavy-tailed prior and persisting posterior heavy tails. Experiments were performed on a computer with an Intel Core i7-8700K and 32GB of RAM. See Figure 2 ("Importance-weighted coreset quality...") and Figure 3 ("Subsample-optimize coreset quality...").
Researcher Affiliation | Academia | Trevor Campbell, Department of Statistics, University of British Columbia, trevor@stat.ubc.ca
Pseudocode | Yes | Algorithm 1 (Importance-weighted coreset construction), Algorithm 2 (Scaled importance-weighted coreset construction), and Algorithm 3 (Subsample-optimize coreset construction) are presented on pages 4 and 6.
Open Source Code | No | From the NeurIPS Paper Checklist, section "5. Open access to data and code": "Answer: [No] Justification: There are no new algorithms presented in this work; the experiments involve only existing methods for which public code is available. The code is not central to the contributions of the paper."
Open Datasets | No | The paper specifies models and data-generation processes (e.g., X_n ∼ Cauchy(θ², 1), i.i.d.) rather than referring to or providing access information for public datasets.
Dataset Splits | No | The paper mentions "validation experiments" but does not explicitly provide details on data splits (e.g., percentages or counts) for training, validation, or test sets.
Hardware Specification | Yes | Experiments were performed on a computer with an Intel Core i7-8700K and 32GB of RAM.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., "Python 3.8, PyTorch 1.9, and CUDA 11.1") needed to replicate the experiments.
Experiment Setup | Yes | From the Figure 2 caption: "Sampling probabilities p_n for both models are set proportional to X_n², thresholded to lie between 0.1/N and 10/N." From the Figure 3 caption: "Sampling probabilities are uniform, p_n = 1/N, and coreset weights were optimized by nonnegative least squares for log-likelihoods discretized via samples from π [34, Eq. 4]." (Hedged code sketches of both setups follow the table.)
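As a reading aid, here is a minimal sketch of the importance-weighted subsampling setup quoted from the Figure 2 caption. It assumes the standard construction in which a point drawn with probability p_n receives weight 1/(M p_n); the dataset size N, coreset size M, and true parameter value are illustrative placeholders, not values taken from the paper, and this is not claimed to reproduce the paper's Algorithm 1 exactly.

```python
import numpy as np

# Hedged sketch of the Figure 2 setup: data from the unidentifiable Cauchy
# location model X_n ~ Cauchy(theta^2, 1), with sampling probabilities
# proportional to X_n^2, thresholded to lie between 0.1/N and 10/N.
rng = np.random.default_rng(0)

N = 10_000        # dataset size (assumed for illustration)
M = 100           # coreset size (assumed for illustration)
theta_true = 2.0  # true location parameter (assumed for illustration)

# X_n ~ Cauchy(theta^2, 1), i.i.d.
X = theta_true**2 + rng.standard_cauchy(N)

# Sampling probabilities p_n proportional to X_n^2, clipped to [0.1/N, 10/N],
# then renormalized so they form a valid sampling distribution.
p = X**2
p /= p.sum()
p = np.clip(p, 0.1 / N, 10.0 / N)
p /= p.sum()

# Importance-weighted coreset: draw M indices with probabilities p_n and
# weight each selected point by 1 / (M p_n), so the weighted coreset
# log-likelihood is an unbiased estimate of the full-data log-likelihood.
idx = rng.choice(N, size=M, replace=True, p=p)
weights = 1.0 / (M * p[idx])
coreset = X[idx]
```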
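A second sketch covers the subsample-optimize setup from the Figure 3 caption: uniform subsampling (p_n = 1/N) followed by nonnegative least squares on log-likelihoods discretized at samples from π. The stand-in data, posterior samples, and sizes below are assumptions for illustration, and the discretization here is a simplified stand-in for the construction cited as [34, Eq. 4]; nonnegativity is imposed so the optimized coreset remains a valid nonnegatively weighted log-likelihood.

```python
import numpy as np
from scipy.optimize import nnls
from scipy.stats import cauchy

# Hedged sketch of the Figure 3 setup: uniform subsample, then NNLS on a
# log-likelihood matrix evaluated at posterior samples.
rng = np.random.default_rng(1)

N, M, S = 10_000, 100, 500  # data size, coreset size, number of posterior samples (assumed)
X = 4.0 + rng.standard_cauchy(N)              # stand-in data (assumed)
thetas = 2.0 + 0.1 * rng.standard_normal(S)   # stand-in samples of theta from pi (assumed)

# Uniform subsample of M points (p_n = 1/N).
idx = rng.choice(N, size=M, replace=False)

# Discretized log-likelihood matrix L[s, n] = log p(X_n | theta_s),
# for the Cauchy location model X_n ~ Cauchy(theta^2, 1).
L_full = cauchy.logpdf(X[None, :], loc=(thetas**2)[:, None], scale=1.0)  # shape (S, N)
L_sub = L_full[:, idx]                                                   # shape (S, M)

# Target: full-data log-likelihood evaluated at each posterior sample.
y = L_full.sum(axis=1)

# Nonnegative least squares: find w >= 0 minimizing ||L_sub @ w - y||_2.
w, rnorm = nnls(L_sub, y)
coreset, weights = X[idx], w
```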