Evaluating Bayesian Models with Posterior Dispersion Indices
Authors: Alp Kucukelbir, Yixin Wang, David M. Blei
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show how a PDI identifies patterns of model mismatch in three real data examples: voting preferences, supermarket shopping, and population genetics. and Section 3 presents an empirical study of model mismatch in three real-world examples: voting preferences, supermarket shopping, and population genetics. |
| Researcher Affiliation | Academia | Columbia University, New York City, USA. |
| Pseudocode | Yes | Algorithm 1: Calculating WAPDI. |
| Open Source Code | Yes | Figure 3. PDI and log predictive accuracy of each president under a mixture of three negative binomials model. Presidents sorted by PDI. The closer to zero, the better. (Code in supplement.) and Figure 4. Not all predictive probabilities are created equal. The translucent curves are the two likelihoods multiplied by the posterior (cropped). The posterior predictives $p(x_1 \mid x)$ and $p(x_2 \mid x)$ for each datapoint is the area under the curve. While both datapoints have the same predictive accuracy, the likelihood for $x_2$ has higher variance under the posterior; $x_2$ is more sensitive to the spread of the posterior than $x_1$. WAPDI captures this effect. (Code in supplement.) |
| Open Datasets | Yes | In 1988, CBS conducted a U.S. nation-wide survey of voting preferences. for voting data, and Market research firm IRi hosts an anonymized dataset of customer shopping behavior at U.S. supermarkets (Bronnenberg et al., 2008). for supermarket data, and We study a dataset of N = 324 individuals from four geographic locations and focus on L = 13,928 locations on the genome. for population genetics. |
| Dataset Splits | No | The paper discusses cross-validation in the context of WAIC, stating '$\mathrm{WAIC} = -\frac{1}{N} \sum_n \log \mu(n) + \sigma^2_{\log}(n)$. WAIC measures generalization error; it asymptotically equates to leave-one-out cross validation'. However, it does not provide specific train/validation/test dataset splits or a cross-validation setup for its own experiments in the empirical study section. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions using 'Stan' and 'automatic differentiation variational inference (ADVI)' but does not provide specific version numbers for these or any other software libraries, compilers, or operating systems. |
| Experiment Setup | Yes | We default to S = 1000 in our experiments. and Set the prior on to match the mean and variance of the data (Robbins, 1964). Choose an uninformative prior on . Three mixtures make sense: two for the typical trends and one for the rest. and A 20-dimensional HPF model discovers intuitive trends. and Figure 5 shows how these individuals mix K = 3 ancestral populations. |
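
The table cites Algorithm 1 (calculating WAPDI) and a default of S = 1000 posterior draws but does not reproduce the algorithm. The following is a minimal sketch, not the authors' supplement code. It assumes WAPDI for datapoint n is the posterior variance of the log likelihood divided by the log of the posterior mean likelihood, a definition that is not stated in the table above. The function name `wapdi`, the `(S, N)` input layout, and the use of the sample variance are likewise assumptions made for illustration.

```python
import numpy as np


def wapdi(log_lik):
    """Per-datapoint WAPDI from posterior draws (hypothetical helper).

    log_lik: array of shape (S, N) holding log p(x_n | theta_s) for
             S posterior draws theta_s and N datapoints.
    Assumed definition: WAPDI(n) = Var[log p(x_n | theta)] / log E[p(x_n | theta)].
    """
    log_lik = np.asarray(log_lik, dtype=float)
    num_draws = log_lik.shape[0]

    # Log of the Monte Carlo mean likelihood, computed with a
    # log-sum-exp shift so that very small likelihoods do not underflow.
    shift = log_lik.max(axis=0)
    log_mean_lik = (
        shift + np.log(np.exp(log_lik - shift).sum(axis=0)) - np.log(num_draws)
    )

    # Sample variance of the log likelihood across posterior draws.
    var_log_lik = log_lik.var(axis=0, ddof=1)

    # Undefined when log_mean_lik == 0 (a datapoint predicted with probability 1).
    return var_log_lik / log_mean_lik


# Toy input: two posterior draws, two datapoints.
# Datapoint 0 has the same likelihood under both draws.
# Datapoint 1 has the same mean likelihood (0.5) but varies across draws.
toy_log_lik = np.log(np.array([[0.5, 0.25],
                               [0.5, 0.75]]))
print(wapdi(toy_log_lik))
```

In the toy input both datapoints have the same posterior mean likelihood, yet only the second has nonzero variance in its log likelihood, so only the second moves away from zero. In practice the `(S, N)` matrix would come from evaluating the model's log likelihood at draws from the fitted posterior, for example draws produced by Stan.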