Bayesian Pseudocoresets

Authors: Dionysis Manousakas, Zuheng Xu, Cecilia Mascolo, Trevor Campbell

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Real and synthetic experiments on high-dimensional data demonstrate that Bayesian pseudocoresets achieve significant reductions in posterior approximation error compared to traditional coresets, and that pseudocoresets provide privacy without a significant loss in approximation quality.
Researcher Affiliation | Academia | Dionysis Manousakas, Department of Computer Science & Technology, University of Cambridge (dm754@cam.ac.uk); Zuheng Xu, Department of Statistics, University of British Columbia (zuheng.xu@stat.ubc.ca); Cecilia Mascolo, Department of Computer Science & Technology, University of Cambridge (cm542@cam.ac.uk); Trevor Campbell, Department of Statistics, University of British Columbia (trevor@stat.ubc.ca)
Pseudocode | Yes | The paper provides Algorithm 1, "Pseudocoreset Variational Inference" (a hedged sketch of the algorithm's objective appears after the table below).
Open Source Code | Yes | Code for the presented experiments is available at https://github.com/trevorcampbell/pseudocoresets-experiments.
Open Datasets | No | No concrete access information (a specific link, DOI, repository name, formal citation with authors/year, or reference to an established benchmark with proper attribution) for publicly available datasets was found. The paper mentions the TRANSACTIONS, CHEMREACT100, and MUSIC datasets, but without direct links or formal citations for accessing them.
Dataset Splits | No | No dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) needed to reproduce the data partitioning was found.
Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types with speeds, or memory amounts) used to run the experiments were found.
Software Dependencies | No | No ancillary software details (e.g., library or solver names with version numbers, such as Python 3.8 or CPLEX 12.4) needed to replicate the experiments were found.
Experiment Setup | Yes | For PSVI and Sparse VI we use minibatch size B = 200, number of gradient updates T = 500, and learning rate schedules γ_t = α t^{-1}. For TRANSACTIONS, CHEMREACT100 and MUSIC, α is respectively set to 0.1, 0.1, 10 for Sparse VI, and 1, 10, 10 for PSVI. ... we use a subsampling ratio q = 2 × 10^{-3}. At each iteration we adapt the clipping norm value C to the median norm ... and use noise level σ = 5. Our hyperparameter choice implies privacy parameters ε = 0.2 and δ = 1/N for each of the datasets. (A sketch of the subsample-clip-noise gradient step appears below.)
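
As a rough illustration of what Algorithm 1 optimizes, the sketch below runs pseudocoreset variational inference on a toy Gaussian location model, where the coreset posterior and the objective KL(π_{u,w} || π) are available in closed form and autograd can stand in for the paper's Monte Carlo gradient estimates. The model, sizes, optimizer, and learning rate are illustrative assumptions, not the paper's settings.

```python
# Minimal PSVI sketch (assumed toy model: x_n ~ N(theta, I), prior theta ~ N(0, I)).
# Here the coreset posterior is Gaussian, so KL(pi_{u,w} || pi) is closed-form and
# we can optimize pseudopoint locations u and weights w directly with autograd;
# Algorithm 1 in the paper instead uses Monte Carlo gradient estimates.
import torch

torch.manual_seed(0)
N, D, M = 1000, 2, 5                         # data size, dimension, pseudopoints
x = torch.randn(N, D) + torch.tensor([2.0, -1.0])

def coreset_posterior(points, weights):
    """Posterior N(mu, var*I) under weighted Gaussian likelihoods and N(0, I) prior."""
    wsum = weights.sum()
    mu = (weights[:, None] * points).sum(0) / (1.0 + wsum)
    var = 1.0 / (1.0 + wsum)
    return mu, var

def kl_isotropic(mu_q, var_q, mu_p, var_p, d):
    """KL( N(mu_q, var_q*I) || N(mu_p, var_p*I) ) in d dimensions."""
    return 0.5 * (d * var_q / var_p + ((mu_p - mu_q) ** 2).sum() / var_p
                  - d + d * torch.log(var_p / var_q))

mu_p, var_p = coreset_posterior(x, torch.ones(N))   # exact full-data posterior

# Variational parameters: M pseudopoint locations and unconstrained log-weights,
# initialized from a random data subset with weights summing to N.
u = x[torch.randperm(N)[:M]].clone().requires_grad_(True)
log_w = torch.full((M,), float(N) / M).log().requires_grad_(True)

opt = torch.optim.Adam([u, log_w], lr=0.05)
for _ in range(500):                         # T = 500 gradient updates, as in the paper
    opt.zero_grad()
    mu_q, var_q = coreset_posterior(u, log_w.exp())
    loss = kl_isotropic(mu_q, var_q, mu_p, var_p, D)
    loss.backward()
    opt.step()

print(f"KL after optimization: {loss.item():.3e}")  # typically near zero
```

With only M = 5 learned pseudopoints the coreset posterior can match the full-data posterior essentially exactly in this toy setting, which illustrates why optimizing point locations can beat subsampling them.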
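The privacy mechanism described in the setup follows the familiar subsampled-Gaussian pattern: Poisson-subsample the data at rate q, clip each per-example gradient to an adaptive norm C (the median per-example norm), and add Gaussian noise of scale σC to the summed gradient. Below is a minimal sketch of one such step; the array shapes, function name, and stand-in gradients are hypothetical, and any normalization of the noisy sum is omitted.

```python
# One differentially private gradient step: adaptive median clipping plus
# Gaussian noise, applied to a Poisson-subsampled batch. The per-example
# gradients here are random stand-ins, not gradients of the PSVI objective.
import numpy as np

rng = np.random.default_rng(0)

def dp_gradient(per_example_grads, sigma=5.0):
    norms = np.linalg.norm(per_example_grads, axis=1)
    C = np.median(norms)                              # adapt clipping norm C to the median
    scale = np.minimum(1.0, C / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale[:, None]      # every row now has norm <= C
    noise = sigma * C * rng.standard_normal(per_example_grads.shape[1])
    return clipped.sum(axis=0) + noise                # noisy sum of clipped gradients

N, P, q = 100_000, 10, 2e-3                           # subsampling ratio q = 2e-3
grads = rng.standard_normal((N, P))                   # hypothetical per-example gradients
batch = grads[rng.random(N) < q]                      # Poisson subsampling at rate q
g_priv = dp_gradient(batch)
```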