Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Averaging on the Bures-Wasserstein manifold: dimension-free convergence of gradient descent
Authors: Jason M. Altschuler, Sinho Chewi, Patrik R. Gerber, Austin J. Stromme
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Although the objective is geodesically non-convex, Riemannian GD empirically converges rapidly, in fact faster than off-the-shelf methods such as Euclidean GD and SDP solvers. [...] In Section 3, we show that for the Wasserstein barycenter problem, Riemannian GD enjoys dimension-free convergence rates (Theorem 2). We make several comments to contextualize this result. First, our result eliminates the exponential dimension dependence of state-of-the-art convergence rates [Che+20], which aligns with the empirical performance of this algorithm (see Figure 1). |
| Researcher Affiliation | Academia | Jason M. Altschuler MIT EMAIL Sinho Chewi MIT EMAIL Patrik Gerber MIT EMAIL Austin J. Stromme MIT EMAIL |
| Pseudocode | Yes | Algorithm 1 GD for Barycenters, Algorithm 2 SGD for Barycenters, Algorithm 3 GD for Regularized Barycenters, Algorithm 4 Smoothed GD for Median |
| Open Source Code | Yes | Code reproducing all experiments in this paper can be found at https://github.com/PatrikGerber/Bures-Barycenters. |
| Open Datasets | No | The paper uses synthetically generated data based on specified parameters (e.g., 'n = 50 covariance matrices of dimension d = 50, each with condition number κ = 1000. Their eigenspaces are independent Haar distributed, and their eigenvalues are equally spaced in the interval [λmin, λmax] = [0.03, 30]'), but does not refer to or provide access information for a publicly available or open dataset. |
| Dataset Splits | No | The paper describes generating synthetic data and running experiments (e.g., 'We run 50 experiments and plot the average accuracy'), but it does not provide specific details on how these generated data are partitioned into training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory specifications, or cloud instance types). |
| Software Dependencies | No | The paper mentions external solvers like SCS [ODo+16; ODo+19] and MOSEK [MOS21] for comparison, but it does not specify version numbers for the software dependencies used in their own implementation of the algorithms. |
| Experiment Setup | Yes | In Figure 2 we compare Riemannian and Euclidean GD on a dataset consisting of n = 50 covariance matrices of dimension d = 50, each with condition number κ = 1000. Their eigenspaces are independent Haar distributed, and their eigenvalues are equally spaced in the interval [λmin, λmax] = [0.03, 30]. [...] In our experiments, Riemannian SGD was competitive on a wide range of problems with η = 1/t. [...] In Figure 5 we plot the suboptimality gap for the unregularized objective F = ∫ W₂²(·, Σ) dP(Σ) as we optimize Fε using Algorithm 4 for various ε. The covariance matrices are generated as in Figure 2, with n = d = 30 and [λmin, λmax] = [0.01, 10]. |
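The experiment setup quoted above is fully specified, so it can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' released code: it generates SPD matrices with Haar-distributed eigenspaces and equally spaced eigenvalues (as in the Figure 2 setup), and runs a plain Riemannian GD iteration in the spirit of Algorithm 1, using the standard closed-form Bures–Wasserstein optimal transport map between Gaussians. Function names, the demo's small dimensions, and the mild conditioning (chosen so the loop converges quickly) are illustrative assumptions.

```python
import numpy as np

def sqrtm_psd(a):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T

def random_covariances(n, d, lam_min, lam_max, seed=0):
    """n SPD matrices with Haar-distributed eigenspaces and eigenvalues
    equally spaced in [lam_min, lam_max]; condition number = lam_max/lam_min."""
    rng = np.random.default_rng(seed)
    eigvals = np.linspace(lam_min, lam_max, d)
    covs = []
    for _ in range(n):
        # QR of a Gaussian matrix, with sign correction of R's diagonal,
        # gives a Haar-distributed orthogonal matrix Q.
        q, r = np.linalg.qr(rng.standard_normal((d, d)))
        q *= np.sign(np.diag(r))
        covs.append((q * eigvals) @ q.T)  # Q diag(eigvals) Q^T
    return covs

def bures_gd_step(sigma, covs, eta=1.0):
    """One Riemannian GD step for the Bures-Wasserstein barycenter.

    Uses the standard optimal transport map between centered Gaussians,
    T_i = S^{-1/2} (S^{1/2} C_i S^{1/2})^{1/2} S^{-1/2}, and the update
    Sigma <- G Sigma G with G = (1 - eta) I + eta * mean_i(T_i).
    """
    s_half = sqrtm_psd(sigma)
    s_half_inv = np.linalg.inv(s_half)
    t_bar = np.mean(
        [s_half_inv @ sqrtm_psd(s_half @ c @ s_half) @ s_half_inv for c in covs],
        axis=0,
    )
    g = (1.0 - eta) * np.eye(sigma.shape[0]) + eta * t_bar
    out = g @ sigma @ g
    return 0.5 * (out + out.T)  # symmetrize against floating-point drift

# Small, well-conditioned demo instance (the paper uses n = d = 50, kappa = 1000).
covs = random_covariances(n=10, d=5, lam_min=0.5, lam_max=2.0)
sigma = np.mean(covs, axis=0)  # Euclidean mean as initialization
for _ in range(50):
    sigma = bures_gd_step(sigma, covs)
```

At the barycenter, the averaged transport map equals the identity, so the distance of the averaged map from I serves as a natural fixed-point residual for monitoring convergence. Note that with η = 1 and a single target matrix, one step lands exactly on that matrix.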