Parallel Streaming Wasserstein Barycenters

Authors: Matthew Staib, Sebastian Claici, Justin M. Solomon, Stefanie Jegelka

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we demonstrate the practical effectiveness of our method, both in tracking moving distributions on a sphere, as well as in a large-scale Bayesian inference task." and "We demonstrate the applicability of our method on two experiments, one synthetic and one performing a real inference task."
Researcher Affiliation | Academia | Matthew Staib, MIT CSAIL, mstaib@mit.edu; Sebastian Claici, MIT CSAIL, sclaici@mit.edu; Justin Solomon, MIT CSAIL, jsolomon@mit.edu; Stefanie Jegelka, MIT CSAIL, stefje@mit.edu
Pseudocode | Yes | Algorithm 1 (Subgradient Ascent), Algorithm 2 (Master Thread), Algorithm 3 (Worker Thread)
Open Source Code | Yes | "We implemented our algorithm in C++ using MPI, and our code is posted at github.com/mstaib/stochastic-barycenter-code."
Open Datasets | Yes | "We run logistic regression on the UCI skin segmentation dataset [8]. The 245057 datapoints are colors represented in R^3, each with a binary label determining whether that color is a skin color." and "[8] Rajen Bhatt and Abhinav Dhall. Skin segmentation dataset. UCI Machine Learning Repository."
Dataset Splits | No | The paper mentions splitting the dataset into 127 subsets for distributed processing, but it does not give percentages or counts for training, validation, or test splits. It states 'Full experiment details are given in Appendix D.', but Appendix D is not provided in the current context.
Hardware Specification | No | The paper mentions using 'an InfiniBand cluster' and states that 'no individual 16 thread node used more than 2GB of memory', but it does not give hardware details such as GPU or CPU models or full processor specifications.
Software Dependencies | No | The paper states 'We implemented our algorithm in C++ using MPI' and 'compute the barycenter LP as in [47] via Mosek [4]', but it does not specify version numbers for MPI, Mosek, or the C++ compiler.
Experiment Setup | Yes | "After 317 seconds, or about 10000 iterations per subset posterior, our algorithm has produced a barycenter on n ≈ 10^4 support points..." and "though tuning the stepsize becomes more challenging." and "For n ≈ 10^4, over a wide range of stepsizes we can in seconds approximate the full posterior better than is possible with the LP as seen in Figure 2 by terminating early."
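
The Dataset Splits entry notes that the 245057 datapoints are divided into 127 subsets, but the summary does not say how the subsets were formed. As a rough illustration only, the sketch below assumes an even, contiguous partition in which subset sizes differ by at most one point. The function names `partition_sizes` and `partition_bounds` are hypothetical and are not taken from the paper or its code.

```python
def partition_sizes(n_points: int, n_subsets: int) -> list[int]:
    """Sizes of n_subsets near-equal chunks that together cover n_points items.

    Assumption: an even split; the paper's actual scheme is not described here.
    """
    base, extra = divmod(n_points, n_subsets)
    # The first `extra` subsets each take one additional point.
    return [base + 1 if i < extra else base for i in range(n_subsets)]


def partition_bounds(n_points: int, n_subsets: int) -> list[tuple[int, int]]:
    """Half-open (start, stop) index ranges for each subset, in order."""
    bounds = []
    start = 0
    for size in partition_sizes(n_points, n_subsets):
        bounds.append((start, start + size))
        start += size
    return bounds


# Figures quoted in the summary: 245057 datapoints, 127 subsets.
sizes = partition_sizes(245057, 127)
bounds = partition_bounds(245057, 127)
print(min(sizes), max(sizes), bounds[0], bounds[-1])
```

Under this assumption, each subset holds 1929 or 1930 points. In practice the data would typically be shuffled before partitioning so that each subset is representative; whether the authors did so is not stated in this summary.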