Markovian Sliced Wasserstein Distances: Beyond Independent Projections
Authors: Khai Nguyen, Tongzheng Ren, Nhat Ho
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we compare MSW distances with previous SW variants in various applications such as gradient flows, color transfer, and deep generative modeling to demonstrate the favorable performance of the MSW. |
| Researcher Affiliation | Academia | Khai Nguyen, Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, TX 78712, khainb@utexas.edu; Tongzheng Ren, Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, tongzheng@utexas.edu; Nhat Ho, Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, TX 78712, minhnhat@utexas.edu |
| Pseudocode | Yes | Algorithm 1: Max sliced Wasserstein distance (a sketch of such an algorithm appears below the table). |
| Open Source Code | Yes | Code for this paper is published at https://github.com/UT-Austin-Data-Science-Group/MSW. |
| Open Datasets | Yes | We compare MSW with previous baselines including SW, Max-SW, K-SW, and Max-K-SW on benchmark datasets: CIFAR10 (image size 32x32) [29], and CelebA [36] (image size 64x64). |
| Dataset Splits | No | The paper mentions training but does not provide explicit training/validation/test splits. It refers to 'benchmark datasets' and 'standard image datasets', which implies standard splits, but these are not explicitly stated. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper mentions using Adam [25] as an optimizer, but does not provide specific version numbers for software dependencies like Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | In the experiments, we utilize the Euler scheme with 300 timesteps and a step size of 10^-3 to move the empirical distribution... For Max-SW, Max-K-SW, iMSW, and viMSW, we use the learning rate parameter for projecting directions η = 0.1. ... The number of training iterations is set to 50000. We update the generator Gϕ every 5 iterations while we update the feature function Fγ every iteration. The mini-batch size m is set to 128 in all datasets. The learning rate for Gϕ and Fγ is 0.0002 and the optimizer is Adam [25] with parameters (β1, β2) = (0, 0.9). We use the order p = 2 for all sliced Wasserstein variants. (Hedged sketches of this gradient-flow and training setup follow the table.) |
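To make the pseudocode row concrete, here is a minimal PyTorch sketch of what a max sliced Wasserstein computation typically looks like, not the paper's implementation: the slicing direction is optimized by projected gradient ascent on the unit sphere, and the one-dimensional Wasserstein distance between projections is evaluated in closed form by sorting. The function names and the defaults `n_steps=100` and `lr=0.1` are illustrative assumptions.

```python
import torch

def wasserstein_1d(x_proj, y_proj, p=2):
    # Closed-form W_p^p between two 1D empirical measures with the same
    # number of equally weighted samples: sort and average coordinate gaps.
    x_sorted, _ = torch.sort(x_proj)
    y_sorted, _ = torch.sort(y_proj)
    return torch.mean(torch.abs(x_sorted - y_sorted) ** p)

def max_sliced_wasserstein(x, y, n_steps=100, lr=0.1, p=2):
    # Projected gradient ascent over the slicing direction theta on the
    # unit sphere, in the spirit of the Max-SW algorithm the row refers to.
    theta = torch.randn(x.shape[1])
    theta = (theta / torch.norm(theta)).requires_grad_(True)
    optimizer = torch.optim.Adam([theta], lr=lr)
    for _ in range(n_steps):
        optimizer.zero_grad()
        loss = -wasserstein_1d(x @ theta, y @ theta, p)  # ascend the distance
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            theta /= torch.norm(theta)  # project back onto the unit sphere
    with torch.no_grad():
        return wasserstein_1d(x @ theta, y @ theta, p) ** (1 / p)
```

The sort-based helper assumes both point clouds have the same number of equally weighted samples, e.g. `max_sliced_wasserstein(torch.randn(500, 10), torch.randn(500, 10) + 1.0)`.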
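The gradient-flow part of the quoted setup (Euler scheme, 300 timesteps, step size 10^-3) amounts to repeatedly moving the particles of a source empirical distribution along the negative gradient of a sliced Wasserstein loss. A hedged sketch with toy 2D Gaussian data follows; the plain random-projection loss `sw2_loss` and the particle counts are illustrative stand-ins for the paper's Markovian (MSW) variants, while the 300 steps and 1e-3 step size come from the quote.

```python
import torch

def sw2_loss(x, y, n_projections=50):
    # Plain sliced Wasserstein loss (order p = 2) with independent random
    # projections; a stand-in for the paper's Markovian (MSW) variants.
    thetas = torch.randn(x.shape[1], n_projections)
    thetas = thetas / torch.norm(thetas, dim=0, keepdim=True)
    x_proj, _ = torch.sort(x @ thetas, dim=0)
    y_proj, _ = torch.sort(y @ thetas, dim=0)
    return torch.mean((x_proj - y_proj) ** 2)

# Euler scheme from the quoted setup: 300 timesteps, step size 1e-3.
source = torch.randn(256, 2, requires_grad=True)  # particles being moved
target = torch.randn(256, 2) + 4.0                # toy target samples
for _ in range(300):
    grad, = torch.autograd.grad(sw2_loss(source, target), source)
    with torch.no_grad():
        source -= 1e-3 * grad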
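The deep generative modeling hyperparameters in the same quote map directly onto optimizer and loop settings. A sketch under stated assumptions: the linear networks standing in for the generator Gϕ and the feature function Fγ are placeholders (the quote does not describe the architectures), and the single-projection `sliced_loss` and the adversarial sign choice for Fγ are illustrative; the learning rate, betas, batch size, iteration count, and update schedule are taken from the quote.

```python
import torch

def sliced_loss(a, b):
    # Single random projection, order p = 2; a toy stand-in for the
    # paper's sliced Wasserstein objective.
    theta = torch.randn(a.shape[1])
    theta = theta / torch.norm(theta)
    return torch.mean((torch.sort(a @ theta).values
                       - torch.sort(b @ theta).values) ** 2)

# Placeholder networks; the quoted setup does not describe the real
# architectures of the generator G_phi or the feature function F_gamma.
G = torch.nn.Linear(128, 64)  # latent code -> sample (toy stand-in)
F = torch.nn.Linear(64, 64)   # sample -> feature (toy stand-in)

# Quoted settings: Adam with learning rate 0.0002 and (beta1, beta2) = (0, 0.9).
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.0, 0.9))
opt_F = torch.optim.Adam(F.parameters(), lr=2e-4, betas=(0.0, 0.9))

m = 128                        # quoted mini-batch size
for it in range(50_000):       # quoted number of training iterations
    real = torch.randn(m, 64)  # stand-in for a real data mini-batch
    z = torch.randn(m, 128)

    # F_gamma is updated every iteration; pushing the loss up is a common
    # adversarial choice for the feature function.
    opt_F.zero_grad()
    (-sliced_loss(F(real), F(G(z).detach()))).backward()
    opt_F.step()

    # G_phi is updated once every 5 iterations.
    if it % 5 == 0:
        opt_G.zero_grad()
        sliced_loss(F(real), F(G(z))).backward()
        opt_G.step()
```

Detaching `G(z)` in the Fγ step keeps the generator's parameters out of the feature function's update, matching the alternating schedule described in the quote.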