Bayesian Pseudocoresets

Authors: Dionysis Manousakas, Zuheng Xu, Cecilia Mascolo, Trevor Campbell

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Real and synthetic experiments on high-dimensional data demonstrate that Bayesian pseudocoresets achieve significant reductions in posterior approximation error compared to traditional coresets, and that pseudocoresets provide privacy without a significant loss in approximation quality.
Researcher Affiliation | Academia | Dionysis Manousakas, Department of Computer Science & Technology, University of Cambridge (dm754@cam.ac.uk); Zuheng Xu, Department of Statistics, University of British Columbia (zuheng.xu@stat.ubc.ca); Cecilia Mascolo, Department of Computer Science & Technology, University of Cambridge (cm542@cam.ac.uk); Trevor Campbell, Department of Statistics, University of British Columbia (trevor@stat.ubc.ca)
Pseudocode | Yes | The paper provides Algorithm 1, "Pseudocoreset Variational Inference" (a hedged sketch of the algorithm's objective appears after the table below).
Open Source Code | Yes | Code for the presented experiments is available at https://github.com/trevorcampbell/pseudocoresets-experiments.
Open Datasets | No | No concrete access information (a specific link, DOI, repository name, formal citation with authors/year, or reference to an established benchmark with proper attribution) for publicly available datasets was found. The paper mentions the TRANSACTIONS, CHEMREACT100, and MUSIC datasets, but without direct links or formal citations for accessing them.
Dataset Splits | No | No dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) needed to reproduce the data partitioning was found.
Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types with speeds, or memory amounts) used to run the experiments were found.
Software Dependencies | No | No ancillary software details (e.g., library or solver names with version numbers, such as Python 3.8 or CPLEX 12.4) needed to replicate the experiments were found.
Experiment Setup | Yes | For PSVI and Sparse VI we use minibatch size B = 200, number of gradient updates T = 500, and learning rate schedules γ_t = α t^{-1}. For TRANSACTIONS, CHEMREACT100 and MUSIC, α is respectively set to 0.1, 0.1, 10 for Sparse VI, and 1, 10, 10 for PSVI. ... we use a subsampling ratio q = 2 × 10^{-3}. At each iteration we adapt the clipping norm value C to the median norm ... and use noise level σ = 5. Our hyperparameter choice implies privacy parameters ε = 0.2 and δ = 1/N for each of the datasets. (A sketch of the subsample-clip-noise gradient step appears below.)
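
As a rough illustration of what Algorithm 1 optimizes, the sketch below runs pseudocoreset variational inference on a toy Gaussian location model, where the coreset posterior and the objective KL(π_{u,w} || π) are available in closed form and autograd can stand in for the paper's Monte Carlo gradient estimates. The model, sizes, optimizer, and learning rate are illustrative assumptions, not the paper's settings.

```python
# Minimal PSVI sketch (assumed toy model: x_n ~ N(theta, I), prior theta ~ N(0, I)).
# Here the coreset posterior is Gaussian, so KL(pi_{u,w} || pi) is closed-form and
# we can optimize pseudopoint locations u and weights w directly with autograd;
# Algorithm 1 in the paper instead uses Monte Carlo gradient estimates.
import torch

torch.manual_seed(0)
N, D, M = 1000, 2, 5                         # data size, dimension, pseudopoints
x = torch.randn(N, D) + torch.tensor([2.0, -1.0])

def coreset_posterior(points, weights):
    """Posterior N(mu, var*I) under weighted Gaussian likelihoods and N(0, I) prior."""
    wsum = weights.sum()
    mu = (weights[:, None] * points).sum(0) / (1.0 + wsum)
    var = 1.0 / (1.0 + wsum)
    return mu, var

def kl_isotropic(mu_q, var_q, mu_p, var_p, d):
    """KL( N(mu_q, var_q*I) || N(mu_p, var_p*I) ) in d dimensions."""
    return 0.5 * (d * var_q / var_p + ((mu_p - mu_q) ** 2).sum() / var_p
                  - d + d * torch.log(var_p / var_q))

mu_p, var_p = coreset_posterior(x, torch.ones(N))   # exact full-data posterior

# Variational parameters: M pseudopoint locations and unconstrained log-weights,
# initialized from a random data subset with weights summing to N.
u = x[torch.randperm(N)[:M]].clone().requires_grad_(True)
log_w = torch.full((M,), float(N) / M).log().requires_grad_(True)

opt = torch.optim.Adam([u, log_w], lr=0.05)
for _ in range(500):                         # T = 500 gradient updates, as in the paper
    opt.zero_grad()
    mu_q, var_q = coreset_posterior(u, log_w.exp())
    loss = kl_isotropic(mu_q, var_q, mu_p, var_p, D)
    loss.backward()
    opt.step()

print(f"KL after optimization: {loss.item():.3e}")  # typically near zero
```

With only M = 5 learned pseudopoints the coreset posterior can match the full-data posterior essentially exactly in this toy setting, which illustrates why optimizing point locations can beat subsampling them.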
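The privacy mechanism described in the setup follows the familiar subsampled-Gaussian pattern: Poisson-subsample the data at rate q, clip each per-example gradient to an adaptive norm C (the median per-example norm), and add Gaussian noise of scale σC to the summed gradient. Below is a minimal sketch of one such step; the array shapes, function name, and stand-in gradients are hypothetical, and any normalization of the noisy sum is omitted.

```python
# One differentially private gradient step: adaptive median clipping plus
# Gaussian noise, applied to a Poisson-subsampled batch. The per-example
# gradients here are random stand-ins, not gradients of the PSVI objective.
import numpy as np

rng = np.random.default_rng(0)

def dp_gradient(per_example_grads, sigma=5.0):
    norms = np.linalg.norm(per_example_grads, axis=1)
    C = np.median(norms)                              # adapt clipping norm C to the median
    scale = np.minimum(1.0, C / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale[:, None]      # every row now has norm <= C
    noise = sigma * C * rng.standard_normal(per_example_grads.shape[1])
    return clipped.sum(axis=0) + noise                # noisy sum of clipped gradients

N, P, q = 100_000, 10, 2e-3                           # subsampling ratio q = 2e-3
grads = rng.standard_normal((N, P))                   # hypothetical per-example gradients
batch = grads[rng.random(N) < q]                      # Poisson subsampling at rate q
g_priv = dp_gradient(batch)
```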