Bayesian Pseudocoresets
Authors: Dionysis Manousakas, Zuheng Xu, Cecilia Mascolo, Trevor Campbell
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Real and synthetic experiments on high-dimensional data demonstrate that Bayesian pseudocoresets achieve significant improvements in posterior approximation error compared to traditional coresets, and that pseudocoresets provide privacy without a significant loss in approximation quality. |
| Researcher Affiliation | Academia | Dionysis Manousakas, Department of Computer Science & Technology, University of Cambridge (dm754@cam.ac.uk); Zuheng Xu, Department of Statistics, University of British Columbia (zuheng.xu@stat.ubc.ca); Cecilia Mascolo, Department of Computer Science & Technology, University of Cambridge (cm542@cam.ac.uk); Trevor Campbell, Department of Statistics, University of British Columbia (trevor@stat.ubc.ca) |
| Pseudocode | Yes | Algorithm 1 Pseudocoreset Variational Inference |
| Open Source Code | Yes | Code for the presented experiments is available at https://github.com/trevorcampbell/pseudocoresets-experiments. |
| Open Datasets | No | No concrete access information (specific link, DOI, repository name, formal citation with authors/year, or reference to established benchmark datasets with proper attribution) for publicly available datasets was found. The paper mentions TRANSACTIONS, CHEMREACT100 and MUSIC datasets, but without direct links or formal citations to access them. |
| Dataset Splits | No | No specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning was found. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments were found. |
| Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment were found. |
| Experiment Setup | Yes | For PSVI and Sparse VI we use minibatch size B = 200, number of gradient updates T = 500, and learning rate schedules γ_t = α t^{-1}. For TRANSACTIONS, CHEMREACT100 and MUSIC, α is respectively set to 0.1, 0.1, 10 for Sparse VI, and 1, 10, 10 for PSVI. ... we use a subsampling ratio q = 2×10^{-3}. At each iteration we adapt the clipping norm value C to the median norm... and use noise level σ = 5. Our hyperparameter choice implies privacy parameters ε = 0.2 and δ = 1/N for each of the datasets. |
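The setup above can be illustrated with a minimal sketch of the stated learning-rate schedule γ_t = α t^{-1} and a DP-SGD-style gradient step (clip per-example gradients to the median norm C, average, add Gaussian noise with scale proportional to σC). This is not the authors' implementation; the function names and the exact noise scaling are assumptions for illustration only.

```python
import numpy as np

def lr_schedule(alpha, t):
    """Learning-rate schedule gamma_t = alpha * t^{-1}, as stated in the setup."""
    return alpha / t

def dp_clip_and_noise(per_example_grads, sigma, rng):
    """Sketch of one private gradient step (assumed DP-SGD-style):
    adapt the clipping norm C to the median per-example gradient norm,
    clip, average, and add Gaussian noise with std sigma * C / n."""
    norms = np.linalg.norm(per_example_grads, axis=1)
    C = np.median(norms)                               # adaptive clipping norm
    scale = np.minimum(1.0, C / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale[:, None]
    n = len(per_example_grads)
    noisy_mean = clipped.mean(axis=0) + rng.normal(0.0, sigma * C / n, clipped.shape[1])
    return noisy_mean, C

rng = np.random.default_rng(0)
grads = rng.normal(size=(200, 5))                      # minibatch of B = 200 per-example gradients
g, C = dp_clip_and_noise(grads, sigma=5.0, rng=rng)
```

With α = 10 (the PSVI setting for MUSIC), the schedule gives γ_1 = 10, γ_2 = 5, decaying as 1/t over the T = 500 updates.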