Computational Guarantees for Doubly Entropic Wasserstein Barycenters

Authors: Tomas Vaskevicius, Lénaïc Chizat

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Experimental Appendix F complements our theoretical results with numerical experiments. Our simulations experimentally confirm the necessity of damping when τ < λ/2. They also provide experimental support for the suggested damping factor in Algorithms 1 and 2. F Numerical Experiments In this section, we numerically validate our main theoretical results presented in Theorems 1 and 2. We empirically demonstrate the necessity of damping Sinkhorn iterations when τ < λ/2. In addition, we examine empirical convergence rates of Algorithms 1 and 2 (for a specific implementation of the approximate Sinkhorn oracle described below) in a simulation setup consisting of isotropic Gaussian measures.
Researcher Affiliation Academia Tomas Vaškevičius and Lénaïc Chizat, Institute of Mathematics, École Polytechnique Fédérale de Lausanne (EPFL), Station Z, CH-1015 Lausanne, Switzerland
Pseudocode Yes Algorithm 1: Exact Damped Sinkhorn Scheme. Input: regularization strengths λ, τ > 0, reference measure π_ref, number of iterations T, and k marginal measures ν_1, ..., ν_k with positive weights w_1, ..., w_k such that ∑_{j=1}^k w_j = 1. Algorithm 2: Approximate Damped Sinkhorn Scheme. Input: error tolerance parameter ε > 0, a function Approximate Sinkhorn Oracle satisfying the properties listed in Definition 1, regularization strengths λ, τ > 0, reference measure π_ref, number of iterations T, and k marginal measures ν_1, ..., ν_k with positive weights w_1, ..., w_k such that ∑_{j=1}^k w_j = 1.
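The algorithms above operate on k marginals with inner/outer regularization λ, τ and a damped update. As an illustrative sketch only, not the paper's barycenter algorithm, the damping idea can be shown on ordinary two-marginal entropic OT: each potential takes a convex-combination step of size eta toward its exact Sinkhorn update (function name, signature, and defaults are ours; eta = 1 recovers undamped Sinkhorn):

```python
import numpy as np

def damped_sinkhorn(C, a, b, lam, eta, iters=2000):
    # Damped Sinkhorn for entropic OT between discrete marginals a and b,
    # with cost matrix C and regularization strength lam.
    # The coupling is parameterized as P = exp((f ⊕ g - C) / lam).
    K = np.exp(-C / lam)
    f = np.zeros(len(a))
    g = np.zeros(len(b))
    for _ in range(iters):
        # exact (undamped) update for f, then a damped step of size eta
        f_star = lam * (np.log(a) - np.log(K @ np.exp(g / lam)))
        f = (1 - eta) * f + eta * f_star
        # same for g, using the freshly damped f
        g_star = lam * (np.log(b) - np.log(K.T @ np.exp(f / lam)))
        g = (1 - eta) * g + eta * g_star
    return np.exp((f[:, None] + g[None, :] - C) / lam)
```

With eta < 1 each iteration makes a smaller but more stable step; at convergence the row and column sums of the returned coupling match a and b.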
Open Source Code Yes The code for reproducing the simulation results is available at https://github.com/TomasVaskevicius/doubly-entropic-barycenters.
Open Datasets No The paper describes using Gaussian measures for its simulation setup (e.g., ν_j = N(m_j, σ²I_d)) but does not provide concrete access information (link, DOI, or a specific citation with authors/year) for a publicly available dataset used for training or evaluation in the traditional machine-learning sense.
Dataset Splits No The paper describes simulation setups with Gaussian measures and discusses convergence, but it does not specify explicit training/validation/test dataset splits, percentages, or sample counts typically found in empirical studies for reproducibility.
Hardware Specification No The paper describes algorithms and numerical simulations but does not specify any hardware details (e.g., GPU/CPU models, memory) used to run these experiments.
Software Dependencies No The paper mentions that the code is available (e.g., on GitHub) but does not provide specific version numbers for software dependencies like Python, PyTorch, or other libraries used in the implementation.
Experiment Setup Yes In all the simulations performed in this section, we let ν_j = N(m_j, σ²I_d) for some m_j ∈ R^d and σ² > 0. Implementation of Algorithm 1. When the marginals ν_j are Gaussian measures (not necessarily isotropic) and when ψ_0^j = 0, the Sinkhorn updates admit a closed-form expression that results in quadratic Sinkhorn potentials ψ^j, φ^j and a Gaussian measure μ_t. In particular, the iterates of Algorithm 1 can be written as ψ_t^j(y) = (1/2) yᵀA_t^j y − yᵀb_t^j + const, φ_t^j(x) = (1/2) xᵀC_t^j x − xᵀd_t^j + const, μ_t = N(e_t, Σ_t) (23) for some matrices A_t^j, C_t^j, Σ_t ∈ R^{d×d} and vectors b_t^j, d_t^j, e_t ∈ R^d that can be computed using explicit recursive expressions (see, e.g., [34, 41]). Implementation of Algorithm 2. We implement the approximate Sinkhorn oracle (Definition 1) as follows. Our updates maintain the property that the potentials ψ^j and φ^j are quadratic functions of the form described in Equation (23). At every iteration t, we perturb each matrix A_t^j (for j = 1, ..., k) via the transformation A_t^j ↦ A_t^j + N_t^j, where N_t^j is a random positive-definite matrix with trace equal to ε. In our simulations, each N_t^j is obtained by drawing an independent d × d matrix U with i.i.d. N(0, 1) entries and setting N_t^j = ε UUᵀ/trace(UUᵀ). The parameter ε controls the approximation error of the implemented approximate Sinkhorn oracle.
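The random perturbation used by the approximate oracle is simple to reproduce. A minimal sketch (function name and signature are ours) that draws N = ε · UUᵀ / trace(UUᵀ) with U a d × d matrix of i.i.d. standard normals, so that N is positive semi-definite (almost surely positive definite, since a square Gaussian matrix is full rank with probability one) with trace exactly ε:

```python
import numpy as np

def random_trace_eps_psd(d, eps, rng=None):
    # Draw N = eps * U U^T / trace(U U^T) with U having i.i.d. N(0, 1) entries.
    # U U^T is positive semi-definite (a.s. positive definite for square U),
    # and the normalization makes trace(N) equal eps exactly.
    rng = np.random.default_rng(rng)
    U = rng.standard_normal((d, d))
    M = U @ U.T
    return eps * M / np.trace(M)
```

Since the perturbation's eigenvalues sum to ε, the parameter ε directly bounds how far the perturbed matrix A_t^j + N_t^j can drift from A_t^j, which is what lets it play the role of the oracle's approximation-error knob.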