Black-box coreset variational inference
Authors: Dionysis Manousakas, Hippolyt Ritter, Theofanis Karaletsos
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our techniques to supervised learning problems, and compare them with existing approaches in the literature for data summarization and inference. ... In this section we evaluate the performance of our inference framework in intractable models and compare against standard variational inference methods and earlier Bayesian coreset constructions, as well as black-box extensions of existing variational coresets that rely on our generalized ELBO Eq. (9). ... (Logistic regression) First, we perform inference on logistic regression fitting 3 publicly available binary classification datasets [17, 53] with sizes ranging between 10k and 150k datapoints, and 10 and 128 dimensions. ... (Bayesian Neural Networks) In this section we present inference results on Bayesian neural networks (BNNs), a model class that previous work on Bayesian coresets did not consider due to the absence of a black-box variational estimator. (A hedged sketch of a coreset-weighted ELBO estimator appears after the table.) |
| Researcher Affiliation | Industry | Dionysis Manousakas Meta dm754@cantab.ac.uk Hippolyt Ritter Meta hippolyt@meta.com Theofanis Karaletsos Insitro theofanis@karaletsos.com |
| Pseudocode | Yes | We provide pseudo-code for the respective methods in Algs. 3-5 in Supplement B. |
| Open Source Code | Yes | We make code available at www.github.com/facebookresearch/Blackbox-Coresets-VI. |
| Open Datasets | Yes | First, we perform inference on logistic regression fitting 3 publicly available binary classification datasets [17, 53] with sizes ranging between 10k and 150k datapoints, and 10 and 128 dimensions. ... (MNIST) In this part we assess the approximation quality of large-scale dataset compression for BNNs via coresets. We compare the predictive performance of black-box PSVI against standard mean-field VI, random coresets and frequentist methods relying on learnable synthetic data, namely dataset distillation w/ and w/o learnable soft labels [49, 54], and data condensation [57]. |
| Dataset Splits | No | The paper refers to the 'full training dataset' and a 'test set' but does not explicitly specify training/validation/test splits, such as percentages or sample counts per subset, and gives no details about a validation set. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU/CPU models, memory) to run its experiments. |
| Software Dependencies | No | The paper mentions software like PyTorch and Pyro in its references but does not provide specific version numbers for these or other key software dependencies used in their experimental setup. |
| Experiment Setup | Yes | We posit normal priors N(0, I) and consider mean-field variational approximations with diagonal covariance. We generate two synthetic datasets with size 1k datapoints, corresponding to noisy samples from a half-moon shaped 2-class dataset, and a mixture of 4 unimodal clusters of data each belonging to a different class [24], and use a layer with 20 and 50 units respectively. To evaluate the representation ability of the pseudocoresets we consider two initialization schemes: we initialise the pseudo locations on a random subset equally split across categories, and a random initialization using a Gaussian centered on the means of the empirical distributions. |
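
The Research Type row cites the paper's generalized ELBO, Eq. (9), which this report does not reproduce. As a rough illustration of the kind of objective involved, the minimal sketch below estimates a standard coreset-weighted ELBO for a mean-field Gaussian approximation under a N(0, I) prior. The function names, the Monte Carlo setup, and the use of a plain weighted log-likelihood (rather than the paper's Eq. (9)) are assumptions made here for illustration only.

```python
import torch
import torch.nn.functional as F

def gaussian_kl_to_std_normal(mu, log_sigma):
    # KL( N(mu, diag(sigma^2)) || N(0, I) ) for a diagonal Gaussian, in closed form.
    return 0.5 * (log_sigma.exp() ** 2 + mu ** 2 - 1.0 - 2.0 * log_sigma).sum()

def coreset_elbo(q_mu, q_log_sigma, core_x, core_y, weights, log_lik_fn, n_mc=8):
    """Monte Carlo estimate of a coreset-weighted ELBO:
    E_q[ sum_m w_m log p(y_m | x_m, theta) ] - KL(q || N(0, I))."""
    sigma = q_log_sigma.exp()
    eps = torch.randn(n_mc, *q_mu.shape)           # reparameterization trick
    thetas = q_mu + sigma * eps                    # (n_mc, D) posterior samples
    ell = torch.stack(
        [(weights * log_lik_fn(theta, core_x, core_y)).sum() for theta in thetas]
    ).mean()
    return ell - gaussian_kl_to_std_normal(q_mu, q_log_sigma)

def logreg_log_lik(theta, x, y):
    # Per-point Bernoulli log-likelihood for logistic regression, y in {0, 1}.
    logits = x @ theta
    return y * F.logsigmoid(logits) + (1.0 - y) * F.logsigmoid(-logits)
```

In a (pseudo)coreset construction, the weights and, for PSVI-style methods, the pseudo-point locations `core_x`, `core_y` would be learned jointly with the variational parameters by ascending this objective; here they are simply inputs.

The Experiment Setup row quotes N(0, I) priors, a mean-field Gaussian approximation with diagonal covariance, a 1k-point half-moon 2-class dataset, and a single hidden layer with 20 units. The sketch below sets up that model and runs plain mean-field VI on the full synthetic dataset rather than the paper's PSVI coreset method; the noise level, tanh activation, learning rate, and step count are assumptions not taken from the paper.

```python
import torch
import torch.nn.functional as F
from sklearn.datasets import make_moons

# Synthetic 2-class half-moons data, 1k points (noise level is an assumption).
X, y = make_moons(n_samples=1000, noise=0.1, random_state=0)
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.long)

# One hidden layer with 20 units; D is the total number of weights and biases.
H = 20
shapes = [(2, H), (H,), (H, 2), (2,)]
D = sum(torch.Size(s).numel() for s in shapes)

# Mean-field Gaussian variational posterior with diagonal covariance,
# under a N(0, I) prior on all parameters.
q_mu = torch.zeros(D, requires_grad=True)
q_log_sigma = torch.full((D,), -3.0, requires_grad=True)
opt = torch.optim.Adam([q_mu, q_log_sigma], lr=1e-2)

def unpack(theta):
    # Split a flat parameter vector into layer weights and biases.
    out, i = [], 0
    for s in shapes:
        n = torch.Size(s).numel()
        out.append(theta[i:i + n].reshape(s))
        i += n
    return out

def forward(theta, x):
    w1, b1, w2, b2 = unpack(theta)
    return torch.tanh(x @ w1 + b1) @ w2 + b2

for step in range(2000):
    eps = torch.randn(D)
    theta = q_mu + q_log_sigma.exp() * eps              # reparameterized sample
    log_lik = -F.cross_entropy(forward(theta, X), y, reduction="sum")
    kl = 0.5 * (q_log_sigma.exp() ** 2 + q_mu ** 2 - 1.0 - 2.0 * q_log_sigma).sum()
    loss = -(log_lik - kl)                              # negative ELBO
    opt.zero_grad()
    loss.backward()
    opt.step()
```

A coreset variant of this setup would replace the full-data log-likelihood with a weighted sum over a small set of learnable pseudo-points, as in the previous sketch.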