Accelerating Sinkhorn algorithm with sparse Newton iterations
Authors: Xun Tang, Michael Shavlovsky, Holakou Rahmanian, Elisa Tardini, Kiran Koshy Thekumparampil, Tesi Xiao, Lexing Ying
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct numerical experiments to compare the original Sinkhorn and SNS algorithms. We use the following settings: We obtained the runtime data on a 2021 Macbook Pro laptop with an Apple M1 Pro chip. The linear system solving is done through the conjugate gradient step as mentioned in Algorithm 1. To set a challenging case, we use an entropic regularization with η = 1200 throughout the experiments. We refer the reader to Appendix F for the performance of SNS under different entropy parameter η. |
| Researcher Affiliation | Collaboration | Xun Tang, Michael Shavlovsky, Holakou Rahmanian, Elisa Tardini, Kiran Koshy Thekumparampil, Tesi Xiao, Lexing Ying. Stanford University and Amazon. {xutang,lexing}@stanford.edu; {shavlov, holakou, ettardin, kkt, tesixiao}@amazon.com |
| Pseudocode | Yes | Algorithm 1 Sinkhorn-Newton-Sparse (SNS) |
| Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that the code for their method is available. |
| Open Datasets | Yes | In the second numerical experiment, similar to the experiment setting in Cuturi (2013), we consider the more practical case of optimal transport on the MNIST dataset. In particular, two images are respectively converted to a vector of intensities on the 28 × 28 pixel grid, which are then normalized to sum to 1. |
| Dataset Splits | No | The paper mentions the use of the MNIST dataset but does not specify how the data was split into training, validation, and test sets, or if standard splits were used. The experiments appear to involve calculating optimal transport between distributions/images rather than training a model with explicit splits. |
| Hardware Specification | Yes | We obtained the runtime data on a 2021 Macbook Pro laptop with an Apple M1 Pro chip. |
| Software Dependencies | No | The paper mentions the use of the conjugate gradient step for linear system solving but does not specify any software names with version numbers for libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | To set a challenging case, we use an entropic regularization with η = 1200 throughout the experiments. We run Algorithm 1 with N1 = 20 and a target sparsity of λ = 2/n. (...) As the Sinkhorn algorithm converges quite slowly, we pick N1 = 700 before switching to the Newton stage. Choosing a target sparsity of λ = 15/n is sufficient for convergence to the ground truth. |
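The table above quotes the paper's two-stage setup: N1 plain Sinkhorn iterations, then a switch to the Newton stage. As a point of reference for the first stage, here is a minimal sketch of the classical Sinkhorn iteration, assuming the common Gibbs-kernel convention K = exp(-eta * C); the log-domain stabilization that a large eta such as 1200 requires in practice is omitted for brevity:

```python
import numpy as np

def sinkhorn(C, r, c, eta, n_iters):
    """Plain Sinkhorn iterations for entropy-regularized optimal transport.

    C: (n, m) cost matrix; r, c: source/target marginals summing to 1;
    eta: entropic regularization strength (the paper uses eta = 1200);
    n_iters: number of iterations (the paper's first-stage count N1).
    Returns the transport plan P and the diagonal scalings (u, v).
    """
    K = np.exp(-eta * C)   # Gibbs kernel; assumed convention K = exp(-eta * C)
    u = np.ones_like(r, dtype=float)
    v = np.ones_like(c, dtype=float)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = r / (K @ v)
        v = c / (K.T @ u)
    P = u[:, None] * K * v[None, :]
    return P, u, v
```

Note that with eta = 1200 and costs of order 1, exp(-eta * C) underflows in double precision, which is why practical implementations iterate on the dual potentials in the log domain instead of on (u, v) directly.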
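The Newton stage of Algorithm 1 sparsifies the Hessian of the dual objective and solves the resulting linear system with conjugate gradient, as the quoted cells describe. The sketch below is an illustrative reconstruction under stated assumptions, not the authors' implementation: it assumes dual potentials f, g with plan entries P_ij = exp(eta * (f_i + g_j - C_ij)), keeps only the largest entries of P (the role played by the target sparsity lambda, e.g. lambda = 2/n), and takes one CG-solved Newton step on the potentials:

```python
import numpy as np
from scipy.sparse import csr_matrix, diags, bmat, identity
from scipy.sparse.linalg import cg

def sparse_newton_step(P, r, c, f, g, eta, sparsity):
    """One sparsified Newton step on the dual potentials (f, g).

    P: current (n, m) transport plan; r, c: target marginals;
    sparsity: fraction of entries of P retained in the Hessian
    (the role of the paper's target sparsity lambda, e.g. 2/n).
    Illustrative sketch only; conventions are assumptions, not the paper's code.
    """
    n, m = P.shape
    # Keep roughly sparsity * n * m of the largest plan entries.
    k = max(int(sparsity * n * m), n + m)
    thresh = np.partition(P.ravel(), -k)[-k]
    P_sp = csr_matrix(np.where(P >= thresh, P, 0.0))

    # Dual gradient: marginal violations of the current (dense) plan.
    grad = np.concatenate([P.sum(1) - r, P.sum(0) - c])

    # Sparsified Hessian of the dual objective (assumed block form):
    # eta * [[diag(P 1), P], [P^T, diag(P^T 1)]].
    row = np.asarray(P_sp.sum(axis=1)).ravel()
    col = np.asarray(P_sp.sum(axis=0)).ravel()
    H = eta * bmat([[diags(row), P_sp],
                    [P_sp.T, diags(col)]], format="csr")

    # The Hessian is singular along the constant shift of the potentials,
    # so add a tiny ridge before the conjugate gradient solve.
    H_reg = H + 1e-10 * identity(n + m, format="csr")
    d, _ = cg(H_reg, -grad, maxiter=200)
    return f + d[:n], g + d[n:]
```

The two knobs quoted in the table map directly onto this sketch: N1 controls how many warm-up Sinkhorn iterations run before the first such Newton step, and lambda (2/n in the first experiment, 15/n on MNIST) controls how many entries of P survive the thresholding.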
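For the MNIST experiment, the paper converts each image to a vector of intensities on the 28 × 28 pixel grid, normalized to sum to 1. That preprocessing can be sketched as follows; the squared-distance ground cost on the pixel grid is an added assumption for illustration, not stated in the excerpts above:

```python
import numpy as np

def image_to_marginal(img):
    """Flatten a 28x28 grayscale image into a probability vector:
    intensities on the pixel grid, normalized to sum to 1 (as in the paper)."""
    v = np.asarray(img, dtype=float).ravel()
    return v / v.sum()

def pixel_cost_matrix(side=28):
    """Squared Euclidean distance between pixel-grid locations, rescaled
    to [0, 1]. An assumed ground cost, not specified in the excerpt."""
    ys, xs = np.meshgrid(np.arange(side), np.arange(side), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    diff = coords[:, None, :] - coords[None, :, :]
    C = (diff ** 2).sum(-1)
    return C / C.max()
```

The two resulting vectors (one per image) serve as the marginals r and c of the optimal transport problem, and C is the 784 × 784 cost between pixel locations.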