On the Convergence of Projected Bures-Wasserstein Gradient Descent under Euclidean Strong Convexity
Authors: Junyi Fan, Yuxuan Han, Zijian Liu, Jian-Feng Cai, Yang Wang, Zhengyuan Zhou
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate significant improvements in computational efficiency and convergence speed, underscoring the efficacy of our method in practical scenarios. |
| Researcher Affiliation | Collaboration | (1) Department of Mathematics, Hong Kong University of Science and Technology; (2) Stern School of Business, New York University; (3) Department of Industrial Engineering and Decision Analytics, Hong Kong University of Science and Technology; (4) Arena Technologies. |
| Pseudocode | Yes | Algorithm 1 Projected BW Gradient Descent |
| Open Source Code | Yes | We have provided the Matlab code to reproduce our numerical results in https://github.com/Junyifannnn/Proj BWGD. |
| Open Datasets | No | The paper describes how synthetic datasets are generated for the WDRO-MMSE and Constrained Barycenter problems, specifying the parameters and distributions from which data elements were sampled (e.g., "elements uniformly sampled from [1, 5]", "sampled from standard normal distribution"). However, it does not provide access information (link, DOI, formal citation with authors/year) for a pre-existing publicly available dataset that was used directly. |
| Dataset Splits | No | The paper does not specify exact training, validation, or test dataset splits (e.g., percentages or sample counts) for its experiments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Matlab code' and that it 'require MOSEK as the SDP solver'. MOSEK is a specific solver, but no version numbers are given for either Matlab or MOSEK, so the software dependencies are not described reproducibly. |
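| Experiment Setup | Yes | For the WDRO-MMSE problem, we aim to solve the following problem: $\min_{\Sigma_x, \Sigma_w} -\operatorname{tr}\left[\Sigma_x - \Sigma_x H^T (H \Sigma_x H^T + \Sigma_w)^{-1} H \Sigma_x\right]$ s.t. $\Sigma_x \in W(\hat{\Sigma}_x, \rho_x)$, $\Sigma_w \in W(\hat{\Sigma}_w, \rho_w)$, where dimension $n = 200$, Wasserstein radii $\rho_x = \rho_w = n$, $\hat{\Sigma}_x = U_x \Lambda_x U_x^T$, $\hat{\Sigma}_w = U_w \Lambda_w U_w^T$ with $U_x$, $U_w$ the orthonormal eigenvectors of $Q_x + Q_x^T$ and $Q_w + Q_w^T$, where $Q_x$, $Q_w$ are sampled from the standard normal distribution on $\mathbb{R}^{n \times n}$, and $\Lambda_x$ and $\Lambda_w$ are diagonal matrices with elements uniformly sampled from $[1, 5]$ and $[1, 2]$. For constrained barycenter, we aim to solve: $\min_{\Sigma \in W(\hat{\Sigma}, \rho)} \sum_{i=1}^{N} \beta_i d^2(\Sigma, \Sigma_i)$, where $n = 30$, $\rho = 4n$, $N = 50$, $\sum_{i=1}^{N} \beta_i = 1$, and $\beta_i = \alpha_i / \sum_{j=1}^{N} \alpha_j$ with $\alpha_i$ generated from $\alpha_i \sim U(0, 1)$, $\hat{\Sigma} = U \Lambda_x U^T$, $\hat{\Sigma}_i = U_i \Lambda_i U_i^T$ with $U$, $U_i$ generated the same way as in WDRO-MMSE, and elements of $\Lambda_x$ and $\Lambda_w$ are from $[1, 2]$ and $[0.1, 100]$. |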
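
The synthetic data described in the Experiment Setup row can be sketched in a few lines. The authors' released code is in Matlab; what follows is only an illustrative NumPy sketch, not their implementation. The function names `random_covariance` and `barycenter_weights`, the seed, and the use of `numpy.linalg.eigh` to obtain the orthonormal eigenvectors are assumptions made for this example.

```python
import numpy as np

def random_covariance(n, low, high, rng):
    """Hypothetical helper: build U diag(lam) U^T as described in the setup.

    U holds the orthonormal eigenvectors of Q + Q^T, where Q has i.i.d.
    standard normal entries; lam is sampled uniformly from [low, high).
    """
    Q = rng.standard_normal((n, n))
    # eigh is used because Q + Q^T is symmetric; it returns orthonormal eigenvectors.
    _, U = np.linalg.eigh(Q + Q.T)
    lam = rng.uniform(low, high, size=n)
    return U @ np.diag(lam) @ U.T

def barycenter_weights(N, rng):
    """Hypothetical helper: beta_i = alpha_i / sum_j alpha_j with alpha_i ~ U(0, 1)."""
    alpha = rng.uniform(0.0, 1.0, size=N)
    return alpha / alpha.sum()

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch

# WDRO-MMSE nominal covariances: n = 200, spectra in [1, 5] and [1, 2].
n = 200
Sigma_x_hat = random_covariance(n, 1.0, 5.0, rng)
Sigma_w_hat = random_covariance(n, 1.0, 2.0, rng)

# Constrained barycenter weights: N = 50.
beta = barycenter_weights(50, rng)
```

Because the eigenvalues are drawn directly, each generated matrix is symmetric positive definite with a spectrum inside the stated interval, which is easy to check with `np.linalg.eigvalsh`.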