Fast Computation of Wasserstein Barycenters
Authors: Marco Cuturi, Arnaud Doucet
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use these algorithms to visualize a large family of images and to solve a constrained clustering problem. |
| Researcher Affiliation | Academia | Marco Cuturi (MCUTURI@I.KYOTO-U.AC.JP), Graduate School of Informatics, Kyoto University; Arnaud Doucet (DOUCET@STAT.OXFORD.AC.UK), Department of Statistics, University of Oxford |
| Pseudocode | Yes | Algorithm 1: Wasserstein Barycenter in $P(X, \Theta)$; Algorithm 2: 2-Wasserstein Barycenter in $P_k(\mathbb{R}^d, \Theta)$; Algorithm 3: Smoothed Primal $T_\lambda$ and Dual $\alpha$ |
| Open Source Code | No | The paper does not provide any links to its own source code or explicitly state that the code is being released. |
| Open Datasets | Yes | We use 50,000 images of the MNIST database, with approximately 5,000 images for each digit from 0 to 9. |
| Dataset Splits | No | The paper describes using the MNIST database and US census data, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts for each split) for model training or evaluation in the traditional machine learning sense. |
| Hardware Specification | Yes | Using a Quadro K5000 GPU with close to 1500 cores, the computation of a single barycenter takes about 2 hours to reach 100 iterations. [...] On a single CPU core, these computations require 12.5 seconds for the constrained case, using Sinkhorn's approximation, and 1.55 seconds for the regular k-means algorithm. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., specific programming languages, libraries, or solvers with their versions). |
| Experiment Setup | Yes | λ is set to 60/median(M), where M is the squared-Euclidean distance matrix between all 2,500 pixels in the grid. [...] display intermediate barycenter solutions for each of these 10 datasets of images for t = 1, 10, 60 gradient iterations. |
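
The regularization setting quoted in the Experiment Setup row, λ = 60/median(M), can be illustrated with a minimal NumPy sketch. It is not the authors' code. It assumes that the 2,500 pixels form a 50 × 50 grid with unit spacing and that the median is taken over every entry of M, including the zero diagonal; the paper excerpt does not state either detail. The function names `squared_euclidean_grid_costs` and `regularization_weight` are hypothetical.

```python
import numpy as np


def squared_euclidean_grid_costs(side):
    """Pairwise squared Euclidean distances between pixels of a side x side grid.

    Assumption: pixels sit at integer coordinates (unit spacing).
    """
    # Row and column coordinate of every pixel, flattened in row-major order.
    rows, cols = np.meshgrid(np.arange(side), np.arange(side), indexing="ij")
    r = rows.ravel().astype(float)
    c = cols.ravel().astype(float)
    # Broadcasting builds the (n, n) matrix of squared coordinate differences.
    return (r[:, None] - r[None, :]) ** 2 + (c[:, None] - c[None, :]) ** 2


def regularization_weight(M, scale=60.0):
    """Return scale / median(M); scale=60 is the value quoted from the paper.

    Assumption: the median is taken over all entries of M.
    """
    return scale / np.median(M)


if __name__ == "__main__":
    M = squared_euclidean_grid_costs(50)  # 2,500 x 2,500 cost matrix
    lam = regularization_weight(M)
    print(M.shape, lam)
```

Dividing by the median makes λ insensitive to the units of the pixel coordinates: rescaling the grid rescales M and λ inversely, so the product λM used by the smoothed solver stays the same.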