Sliced Wasserstein Estimation with Control Variates
Authors: Khai Nguyen, Nhat Ho
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that the proposed control variate estimators yield considerably smaller variances than the conventional computational estimator of the SW distance, using a finite number of projections, when comparing empirical distributions over images and point clouds. Moreover, we illustrate that the computation for control variates is negligible compared to the computation of the one-dimensional Wasserstein distance. Finally, we further demonstrate the favorable performance of the control variate approach in gradient flows between two point clouds, and in learning deep generative models on the CIFAR10 (Krizhevsky et al., 2009) and CelebA (Liu et al., 2015) datasets. (An illustrative sketch of such a control variate estimator appears after this table.) |
| Researcher Affiliation | Academia | Khai Nguyen & Nhat Ho, Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, TX 78712, USA. {khainb,minhnhat}@utexas.edu |
| Pseudocode | Yes | We refer to Algorithm 1 in Appendix D for the detailed algorithm for the SW. We refer to Algorithms 2-3 for the detailed algorithms for the control variate estimators in Appendix D. |
| Open Source Code | Yes | Code for the paper is published at https://github.com/khainb/CV-SW. |
| Open Datasets | Yes | CIFAR10 (Krizhevsky et al., 2009) and CelebA (Liu et al., 2015) datasets. MNIST (LeCun et al., 1998) and ShapeNet Core-55 dataset (Chang et al., 2015). |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly provide specific details on validation dataset splits (percentages, counts, or a clear strategy for forming a validation set) needed for reproduction. |
| Hardware Specification | Yes | For comparing empirical probability measures over images and point-cloud application, and the point-cloud gradient flows application, we use a Macbook Pro M1 for conducting experiments. For deep generative modeling, experiments are run on a single NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions specific software components like "POT: Python Optimal Transport" and the "Adam (Kingma & Ba, 2014) optimizer", but it does not provide specific version numbers for any of these, nor for any other programming languages or deep learning frameworks used (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | The number of training iterations is set to 100000 on CIFAR10 and 50000 on CelebA. We update the generator Gϕ every 5 iterations and we update the feature function Tβ every iteration. The mini-batch size m is set to 128 in all datasets. We use the Adam (Kingma & Ba, 2014) optimizer with parameters (β1, β2) = (0, 0.9) for both Gϕ and Tβ with the learning rate 0.0002. (A sketch of this training configuration follows the table.) |
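
To make the estimator referenced in the Research Type and Pseudocode rows concrete, the snippet below is a minimal NumPy sketch, not the authors' released code (see https://github.com/khainb/CV-SW), of the plain Monte Carlo sliced Wasserstein estimator together with a generic control-variate correction. The paper's actual control variates are built from Gaussian approximations of the projected measures; here `control` and `control_mean` are hypothetical placeholders for any scalar control variate with a known (or cheaply computable) expectation over projections.

```python
import numpy as np

def one_d_wasserstein_pp(x, y, p=2):
    # W_p^p between two 1-D empirical measures with the same number of atoms
    # and uniform weights: sort both samples and average |x_(i) - y_(i)|^p.
    return np.mean(np.abs(np.sort(x) - np.sort(y)) ** p)

def sw_estimate(X, Y, L=100, p=2, control=None, control_mean=None, seed=0):
    # Plain Monte Carlo estimate of SW_p^p(X, Y) over L random projections and,
    # if a control variate is supplied, the corrected estimate
    #   Z = W_bar - gamma * (C_bar - E[C]).
    rng = np.random.default_rng(seed)
    thetas = rng.normal(size=(L, X.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)  # uniform directions on the sphere
    w = np.array([one_d_wasserstein_pp(X @ th, Y @ th, p) for th in thetas])
    mc = w.mean()
    if control is None:
        return mc, None
    c = np.array([control(th) for th in thetas])   # control variate values per projection
    cov = np.cov(w, c)                             # 2x2 sample covariance matrix
    gamma = cov[0, 1] / cov[1, 1]                  # estimated variance-minimizing coefficient
    return mc, mc - gamma * (c.mean() - control_mean)
```

When `control_mean` is exact, the corrected estimate remains unbiased for SW_p^p, and its variance is reduced in proportion to how strongly the control variate correlates with the projected Wasserstein values, which is the effect the paper measures empirically.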
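
For the Experiment Setup row, the optimizer settings and update schedule can be written as the following PyTorch sketch; the `generator` and `feature_fn` modules are hypothetical stand-ins for the paper's Gϕ and Tβ, and only the Adam parameters, learning rate, batch size, iteration counts, and update frequencies are taken from the row above.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the paper's generator G_phi and feature function T_beta.
generator = nn.Sequential(nn.Linear(128, 3 * 32 * 32), nn.Tanh())
feature_fn = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))

# Reported optimizer settings: Adam with (beta1, beta2) = (0, 0.9) and learning rate 0.0002.
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.0, 0.9))
opt_t = torch.optim.Adam(feature_fn.parameters(), lr=2e-4, betas=(0.0, 0.9))

batch_size = 128          # mini-batch size reported for all datasets
num_iterations = 100_000  # 100000 on CIFAR10; 50000 on CelebA

for it in range(num_iterations):
    # The SW-based losses and their backward passes are omitted in this sketch, so the
    # optimizer steps below are no-ops; they only mark the reported schedule:
    # T_beta is updated every iteration, G_phi every 5 iterations.
    opt_t.zero_grad()
    opt_t.step()
    if it % 5 == 0:
        opt_g.zero_grad()
        opt_g.step()
```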