Efficient Optimal Transport Algorithm by Accelerated Gradient Descent
Authors: Dongsheng An, Na Lei, Xiaoyin Xu, Xianfeng Gu
AAAI 2022, pp. 10119–10128 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Theoretically, the computational complexity of the proposed method is lower than the current estimate for the Sinkhorn algorithm in terms of precision. Experimentally, compared with the Sinkhorn algorithm, the results demonstrate that the proposed method achieves faster convergence and better accuracy with the same parameters. |
| Researcher Affiliation | Academia | Stony Brook University, NY, USA; Dalian University of Technology, Liaoning, China; Harvard Medical School, MA, USA |
| Pseudocode | Yes | Algorithm 1: Accelerated gradient descent for OT (an illustrative sketch follows the table) |
| Open Source Code | No | The paper mentions that "All of the codes were written in MATLAB with GPU acceleration" but does not provide a specific repository link or an explicit statement about the code being publicly available. |
| Open Datasets | Yes | For the experiments with the ED (Euclidean distance) and SED (squared Euclidean distance) cost matrices, the distributions come from the MNIST dataset (LeCun and Cortes 1998). |
| Dataset Splits | No | The paper describes how the data is generated or chosen for experiments (e.g., "randomly generated from the Gaussian distribution", "randomly choose one pair of images from the MNIST dataset") but does not provide specific details on training, validation, or test splits (e.g., percentages, sample counts, or predefined standard splits for reproducibility). |
| Hardware Specification | Yes | The experiments are also conducted on a Windows laptop with Intel Core i7-7700HQ CPU, 16 GB memory and NVIDIA GTX 1060Ti GPU. |
| Software Dependencies | No | The paper states: "All of the codes were written in MATLAB with GPU acceleration". However, it does not provide specific version numbers for MATLAB or any other software libraries/dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | In our experiments, to get λ as small as possible, based on Property 1 of Eqn. (6), we set the median of the cost matrix C equal to zero, so that the full range of the exponential of the floating-point numbers can be used, instead of only the negative part. Thus we set C = C − (C_max + C_min)/2 and call it the translation trick. If the range of C is denoted as R, then the accuracy parameter is set to λ = R/T, where T is a positive constant. For the FISTA algorithm, the ideal step size should be η_t = 1/σ_max, where σ_max is the maximal eigenvalue of the Hessian matrix ∇²E_λ(ψ) in Eqn. (14). By Nesterov smoothing, we know σ_max ≤ 1/λ, so we set the step length η_t = ηλ, where η is a constant. In practice we use (T, η) as control parameters instead of (λ, η_t). |
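To make the experiment setup above concrete, here is a minimal Python sketch of the translation trick and of the mapping from the control parameters (T, η) to (λ, η_t). The helper name `setup_parameters` and the NumPy formulation are illustrative assumptions; the authors' implementation is in MATLAB and is not publicly available.

```python
import numpy as np

def setup_parameters(C, T, eta):
    """Hypothetical helper: translation trick plus (T, eta) -> (lambda, eta_t).

    Shifting C by the midpoint of its range centers the exponents
    (psi_j - C_ij) / lambda around zero, so the full floating-point
    exponent range is usable rather than only the negative part.
    """
    c_min, c_max = C.min(), C.max()
    C_shifted = C - (c_max + c_min) / 2.0  # translation trick: C <- C - (Cmax + Cmin)/2
    R = c_max - c_min                      # range of the cost matrix
    lam = R / T                            # accuracy parameter: lambda = R / T
    eta_t = eta * lam                      # step length: eta_t = eta * lambda
    return C_shifted, lam, eta_t
```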
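The pseudocode row above cites Algorithm 1 (accelerated gradient descent for OT), but the paper's listing is not reproduced on this page. The sketch below shows one plausible reading of that algorithm: FISTA applied to a Nesterov-smoothed dual OT energy. The exact form of E_λ(ψ) (Eqn. (14) in the paper), the softmax gradient, and all function names are assumptions based on standard smoothed-dual formulations, not the authors' code.

```python
import numpy as np
from scipy.special import softmax

def smoothed_dual_grad(psi, C, mu, nu, lam):
    """Gradient of an assumed smoothed dual energy
    E_lam(psi) = lam * sum_i mu_i * logsumexp_j((psi_j - C_ij) / lam) - nu . psi;
    the paper's Eqn. (14) may differ in sign or normalization conventions."""
    P = softmax((psi[None, :] - C) / lam, axis=1)  # row-wise soft assignments, shape (m, n)
    return mu @ P - nu                             # marginal violation, shape (n,)

def fista_ot(C, mu, nu, lam, eta, iters=1000):
    """FISTA on the smoothed dual, with step length eta_t = eta * lam
    (valid when sigma_max <= 1/lam, as stated in the setup above)."""
    psi = np.zeros(C.shape[1])
    y, t = psi.copy(), 1.0
    step = eta * lam
    for _ in range(iters):
        # gradient step from the extrapolated point
        psi_next = y - step * smoothed_dual_grad(y, C, mu, nu, lam)
        # standard FISTA momentum update
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = psi_next + ((t - 1.0) / t_next) * (psi_next - psi)
        psi, t = psi_next, t_next
    return psi
```

Given `C_shifted, lam, _ = setup_parameters(C, T, eta)` from the previous sketch, a call such as `fista_ot(C_shifted, mu, nu, lam, eta)` would tie the two pieces together under the same assumptions.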