Black-box coreset variational inference

Authors: Dionysis Manousakas, Hippolyt Ritter, Theofanis Karaletsos

NeurIPS 2022

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental
Evidence: "We apply our techniques to supervised learning problems, and compare them with existing approaches in the literature for data summarization and inference." (Section 4, Experiments) "In this section we evaluate the performance of our inference framework in intractable models and compare against standard variational inference methods and earlier Bayesian coreset constructions, as well as black-box extensions of existing variational coresets that rely on our generalized ELBO Eq. (9)." (Section 4.1, Logistic regression) "First, we perform inference on logistic regression fitting 3 publicly available binary classification datasets [17, 53] with sizes ranging between 10k and 150k datapoints, and 10 and 128 dimensions." (Section 4.2, Bayesian Neural Networks) "In this section we present inference results on Bayesian neural networks (BNNs), a model class that previous work on Bayesian coresets did not consider due to the absence of a black-box variational estimator."
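The quotes above describe a generalized ELBO estimated black-box via Monte Carlo sampling. As a minimal sketch only (hypothetical function names, plain numpy, not the authors' implementation), a reparameterized estimate of a coreset-weighted ELBO for logistic regression with a mean-field Gaussian posterior might look like:

```python
import numpy as np

def weighted_elbo_estimate(u, z, w, mu, log_sigma, n_samples=64, rng=None):
    """Monte Carlo estimate of a coreset-weighted ELBO for logistic regression.

    u: (m, d) coreset inputs; z: (m,) labels in {0, 1}; w: (m,) coreset weights.
    Variational posterior q(theta) = N(mu, diag(exp(log_sigma)^2));
    prior p(theta) = N(0, I), matching the paper's setup.
    """
    rng = np.random.default_rng(rng)
    d = mu.shape[0]
    eps = rng.standard_normal((n_samples, d))
    theta = mu + np.exp(log_sigma) * eps           # reparameterized samples
    logits = theta @ u.T                           # (n_samples, m)
    # Bernoulli log-likelihood per sample and datapoint, numerically stable
    log_lik = z * logits - np.logaddexp(0.0, logits)
    weighted_ll = (log_lik * w).sum(axis=1)        # coreset-weighted data term
    # KL(q || N(0, I)) for diagonal Gaussians, in closed form
    kl = 0.5 * np.sum(np.exp(2 * log_sigma) + mu**2 - 1.0 - 2 * log_sigma)
    return weighted_ll.mean() - kl
```

Because only Monte Carlo samples of the log-likelihood are needed, this estimator requires no model-specific derivations, which is what makes the "black-box" extension to BNNs possible.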
Researcher Affiliation: Industry
Evidence: Dionysis Manousakas (Meta, dm754@cantab.ac.uk), Hippolyt Ritter (Meta, hippolyt@meta.com), Theofanis Karaletsos (Insitro, theofanis@karaletsos.com)
Pseudocode: Yes
Evidence: "We provide pseudo-code for the respective methods in Algs. 3-5 in Supplement B."
Open Source Code: Yes
Evidence: "We make code available at www.github.com/facebookresearch/Blackbox-Coresets-VI."
Open Datasets: Yes
Evidence: "First, we perform inference on logistic regression fitting 3 publicly available binary classification datasets [17, 53] with sizes ranging between 10k and 150k datapoints, and 10 and 128 dimensions." (MNIST) "In this part we assess the approximation quality of large-scale dataset compression for BNNs via coresets. We compare the predictive performance of black-box PSVI against standard mean-field VI, random coresets and frequentist methods relying on learnable synthetic data, namely dataset distillation w/ and w/o learnable soft labels [49, 54], and data condensation [57]."
Dataset Splits: No
Evidence: The paper mentions a "full training dataset" and a "test set" but does not explicitly provide training/validation/test split details, such as percentages or sample counts per subset, nor any specifics of a validation set.
Hardware Specification: No
Evidence: The paper does not explicitly describe the hardware used to run its experiments (e.g., GPU/CPU models, memory).
Software Dependencies: No
Evidence: The paper references software such as PyTorch and Pyro but does not provide version numbers for these or other key dependencies used in its experimental setup.
Experiment Setup: Yes
Evidence: "We posit normal priors N(0, I) and consider mean-field variational approximations with diagonal covariance." "We generate two synthetic datasets with size 1k datapoints, corresponding to noisy samples from a half-moon shaped 2-class dataset, and a mixture of 4 unimodal clusters of data each belonging to a different class [24], and use a layer with 20 and 50 units respectively." "To evaluate the representation ability of the pseudocoresets we consider two initialization schemes: we initialise the pseudo locations on a random subset equally split across categories, and a random initialization using a Gaussian centered on the means of the empirical distributions."
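The two initialization schemes quoted above are simple to implement. A minimal sketch (hypothetical helper name, plain numpy, not the authors' code) of both schemes:

```python
import numpy as np

def init_pseudocoreset(X, y, m, scheme="subset", rng=None):
    """Initialize m pseudocoreset locations for a labeled dataset (X, y).

    scheme="subset": random subset of X, split equally across classes.
    scheme="gaussian": random draws from a unit Gaussian centered on each
    class's empirical mean. (Hypothetical helper illustrating the two
    initialization schemes described in the paper's experiment setup.)
    """
    rng = np.random.default_rng(rng)
    classes = np.unique(y)
    per_class = m // len(classes)       # equal split across categories
    locs, labels = [], []
    for c in classes:
        Xc = X[y == c]
        if scheme == "subset":
            idx = rng.choice(len(Xc), size=per_class, replace=False)
            locs.append(Xc[idx])
        else:  # "gaussian"
            mean = Xc.mean(axis=0)
            locs.append(mean + rng.standard_normal((per_class, Xc.shape[1])))
        labels.append(np.full(per_class, c))
    return np.concatenate(locs), np.concatenate(labels)
```

The pseudo locations returned here would then be treated as learnable parameters and optimized jointly with the variational posterior.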