Parallelizing MCMC with Random Partition Trees

Authors: Xiangyu Wang, Fangjian Guo, Katherine A. Heller, David B. Dunson

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide theoretical justification and extensive experiments illustrating empirical performance." ... "In this section, we evaluate the empirical performance of PART and compare the two algorithms PART-KD and PART-ML to the following posterior aggregation algorithms."
Researcher Affiliation | Academia | Xiangyu Wang, Dept. of Statistical Science, Duke University (xw56@stat.duke.edu); Fangjian Guo, Dept. of Computer Science, Duke University (guo@cs.duke.edu); Katherine A. Heller, Dept. of Statistical Science, Duke University (kheller@stat.duke.edu); David B. Dunson, Dept. of Statistical Science, Duke University (dunson@stat.duke.edu)
Pseudocode | Yes | Algorithm 1: partition tree algorithm; Algorithm 2: density aggregation algorithm (drawing N samples from the aggregated posterior)
Open Source Code | Yes | MATLAB implementation available from https://github.com/richardkwo/random-tree-parallel-MCMC
Open Datasets | Yes | "The Covertype dataset [17]" (footnote 4: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) ... "and the MiniBooNE dataset [18, 19]" (footnote 5: https://archive.ics.uci.edu/ml/machine-learning-databases/00199)
Dataset Splits | No | The paper states that "the training set is randomly split into m = 40 subsets" for the synthetic data and that "we reserve 1/5 of the data as the test set" for the real datasets, but it does not mention a separate validation set for hyperparameter tuning or model selection in a standard train/validation/test split.
Hardware Specification | No | The paper provides no hardware details (GPU models, CPU types, or memory specifications) for the experiments.
Software Dependencies | No | The paper mentions a MATLAB implementation and refers to an R package for the Weierstrass sampler, but it gives no version numbers for these or any other software dependencies, which a reproducible description requires.
Experiment Setup | Yes | For PART-KD/ML, one-stage aggregation (Algorithm 2) is used only for the toy examples. For the other experiments, pairwise aggregation is used, drawing 50,000 samples at intermediate stages and halving δρ after each stage to refine the resolution. The random ensemble of PART consists of 40 trees; results are obtained with δρ = 0.001 and δa = 0.0001.
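The aggregation step described in the setup row above can be sketched in code. The following is a minimal, hypothetical 1-D illustration, not the authors' MATLAB implementation: two subset posteriors are approximated by histograms on a shared partition (a crude stand-in for the paper's random partition trees), the block densities are multiplied, and new samples are drawn from the normalized product. The function name `aggregate_pair`, the bin count derived from `delta_rho`, and the Gaussian toy data are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def aggregate_pair(samples_a, samples_b, delta_rho=0.001, n_draws=50_000):
    """One pairwise-aggregation stage (hypothetical 1-D sketch).

    Both subset posteriors are approximated as histograms on a shared
    partition; block densities are multiplied and the normalized product
    is resampled. delta_rho stands in for the paper's resolution
    parameter: smaller values give a finer partition.
    """
    lo = min(samples_a.min(), samples_b.min())
    hi = max(samples_a.max(), samples_b.max())
    n_bins = max(2, int(round(1.0 / delta_rho)))   # finer blocks for smaller delta_rho
    edges = np.linspace(lo, hi, n_bins + 1)
    widths = np.diff(edges)
    dens_a, _ = np.histogram(samples_a, bins=edges, density=True)
    dens_b, _ = np.histogram(samples_b, bins=edges, density=True)
    probs = dens_a * dens_b * widths               # blockwise product of densities
    probs /= probs.sum()                           # normalize to a distribution over blocks
    blocks = rng.choice(n_bins, size=n_draws, p=probs)
    # Draw uniformly within each selected block.
    return edges[blocks] + widths[blocks] * rng.random(n_draws)

# Toy check: the product of two overlapping Gaussian "subset posteriors"
# concentrates between them (here, near 0 with reduced variance).
a = rng.normal(-0.5, 1.0, 20_000)
b = rng.normal(0.5, 1.0, 20_000)
combined = aggregate_pair(a, b)
```

In the paper's pairwise scheme this step would be applied recursively across the m = 40 subset chains, halving δρ after each stage; a single stage suffices here to show the blockwise density product.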