Efficient Optimal Transport Algorithm by Accelerated Gradient Descent

Authors: Dongsheng An, Na Lei, Xiaoyin Xu, Xianfeng Gu

AAAI 2022 (pp. 10119-10128) | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment: each entry below gives the variable, the result, and the LLM response.
Research Type: Experimental. Theoretically, the computational complexity of the proposed method is lower than the current estimate for the Sinkhorn algorithm in terms of the precision; experimentally, compared with the Sinkhorn algorithm, our results demonstrate that the proposed method achieves faster convergence and better accuracy with the same parameters.
Researcher Affiliation: Academia. 1) Stony Brook University, NY, USA; 2) Dalian University of Technology, Liaoning, China; 3) Harvard Medical School, MA, USA.
Pseudocode: Yes. Algorithm 1: Accelerated gradient descent for OT.
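Algorithm 1 itself is not reproduced on this page. As a rough illustration of the technique it names, the sketch below runs FISTA-style accelerated gradient descent on a Nesterov-smoothed semi-dual OT energy in Python/NumPy. The objective E_λ(ψ) used here is an assumption (a standard log-sum-exp smoothing of the dual variable), and it may differ in sign conventions or details from the paper's Eqn. (14); the function names are hypothetical.

```python
import numpy as np

def smoothed_energy_grad(psi, C, mu, nu, lam):
    """Value and gradient of an assumed Nesterov-smoothed semi-dual OT energy:
    E_lam(psi) = lam * sum_i mu_i * log(sum_j exp((psi_j - C_ij)/lam)) - sum_j nu_j * psi_j."""
    Z = (psi[None, :] - C) / lam                    # n x m scaled scores
    Zmax = Z.max(axis=1, keepdims=True)             # shift for a stable log-sum-exp
    logsum = Zmax[:, 0] + np.log(np.exp(Z - Zmax).sum(axis=1))
    pi = np.exp(Z - Zmax)
    pi /= pi.sum(axis=1, keepdims=True)             # row-softmax (soft assignment of mass)
    energy = lam * (mu * logsum).sum() - (nu * psi).sum()
    grad = mu @ pi - nu                             # column-marginal mismatch
    return energy, grad

def fista_ot(C, mu, nu, lam, eta, iters=1000):
    """FISTA-style accelerated gradient descent on E_lam (illustrative sketch)."""
    psi = np.zeros_like(nu)
    y, s = psi.copy(), 1.0
    for _ in range(iters):
        _, g = smoothed_energy_grad(y, C, mu, nu, lam)
        psi_prev, psi = psi, y - eta * g            # gradient step; eta is the step length
        s_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * s * s))
        y = psi + ((s - 1.0) / s_next) * (psi - psi_prev)   # Nesterov momentum
        s = s_next
    return psi
```

With λ = R/T and step length η_t = ηλ as in the experiment setup quoted further down this page, a call would look like fista_ot(C, mu, nu, lam, eta_t).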
Open Source Code: No. The paper mentions that "All of the codes were written in MATLAB with GPU acceleration" but does not provide a specific repository link or an explicit statement about the code being publicly available.
Open Datasets: Yes. For the experiments with ED and SED, the distributions come from the MNIST dataset (LeCun and Cortes 1998), as illustrated by the cost matrix.
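For context, one plausible way to turn a pair of MNIST images into such an OT instance is sketched below in Python/NumPy. It assumes ED and SED denote Euclidean and squared Euclidean ground costs on the 28x28 pixel grid; the function name is illustrative and this is not the paper's preprocessing code.

```python
import numpy as np

def mnist_pair_to_ot_problem(img_a, img_b, squared=False):
    """Turn two 28x28 MNIST images into OT marginals and a ground-cost matrix.
    Illustrative sketch: ED/SED are assumed to be the (squared) Euclidean
    distance between pixel grid locations."""
    mu = img_a.ravel().astype(float); mu /= mu.sum()   # source marginal from pixel mass
    nu = img_b.ravel().astype(float); nu /= nu.sum()   # target marginal from pixel mass
    xs, ys = np.meshgrid(np.arange(28), np.arange(28), indexing="ij")
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1)  # 784 x 2 pixel coordinates
    diff = grid[:, None, :] - grid[None, :, :]
    C = (diff ** 2).sum(axis=2).astype(float)          # SED: squared Euclidean cost
    if not squared:
        C = np.sqrt(C)                                 # ED: Euclidean cost
    return mu, nu, C
```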
Dataset Splits: No. The paper describes how the data are generated or chosen for the experiments (e.g., "randomly generated from the Gaussian distribution", "randomly choose one pair of images from the MNIST dataset") but does not provide specific details on training, validation, or test splits (e.g., percentages, sample counts, or predefined standard splits for reproducibility).
Hardware Specification: Yes. The experiments are conducted on a Windows laptop with an Intel Core i7-7700HQ CPU, 16 GB of memory, and an NVIDIA GTX 1060Ti GPU.
Software Dependencies: No. The paper states that "All of the codes were written in MATLAB with GPU acceleration", but it does not provide specific version numbers for MATLAB or any other software libraries/dependencies, which are required for reproducibility.
Experiment Setup: Yes. In our experiments, to get λ as small as possible, based on Property 1 of Eqn. (6), we set the median of the cost matrix C equal to zero, so that the full range of the exponential of the floating-point numbers can be used instead of only the negative part. Thus we set $C = C - \frac{C_{\max} + C_{\min}}{2}$ and call it the translation trick. If the range of C is denoted as R, then the accuracy parameter is set to be $\lambda = R/T$, where T is a positive constant. For the FISTA algorithm, the ideal step size should be $\eta_t = 1/\sigma_{\max}$, where $\sigma_{\max}$ is the maximal eigenvalue of the Hessian matrix $\nabla^2 E_\lambda(\psi)$ in Eqn. (14). By Nesterov smoothing, we know $\sigma_{\max} \leq 1/\lambda$, so we set the step length $\eta_t = \eta\lambda$, where η is a constant. In practice we use (T, η) as control parameters instead of (λ, $\eta_t$).
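The quoted setup maps directly to a few lines of code. A minimal Python/NumPy sketch follows (the paper's code was MATLAB; ot_parameters, T, and eta0 are illustrative names, with (T, eta0) standing in for the control parameters (T, η)):

```python
import numpy as np

def ot_parameters(C, T, eta0):
    """Translation trick and step-length choice, as described in the quoted setup.
    Illustrative sketch only; T and eta0 are the control parameters (T, eta)."""
    C = C - (C.max() + C.min()) / 2.0   # translation trick: center the cost range at zero
    R = C.max() - C.min()               # range R of the (shifted) cost matrix
    lam = R / T                         # accuracy parameter: lambda = R / T
    eta_t = eta0 * lam                  # step length eta_t = eta * lambda, since sigma_max <= 1/lambda
    return C, lam, eta_t
```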