Sliced Wasserstein with Random-Path Projecting Directions
Authors: Khai Nguyen, Shujian Zhang, Tam Le, Nhat Ho
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we compare the proposed RPSW and IWRPSW to the existing SW variants such as SW, Max-SW, DSW, and EBSW in gradient flow in Section 4.1 and in training denoising diffusion models in Section 4.3. We show both the qualitative visualization and quantitative comparison (in Wasserstein-2 distance; Flamary et al., 2021) in Figure 1. (A sketch of the Wasserstein-2 evaluation appears below this table.) |
| Researcher Affiliation | Academia | ¹Department of Statistics and Data Sciences, University of Texas at Austin, USA; ²Department of Advanced Data Science, The Institute of Statistical Mathematics (ISM), Japan; ³RIKEN AIP, Japan. |
| Pseudocode | Yes | Algorithm 1: Computational algorithm of RPSW (a hedged re-implementation sketch appears below this table). |
| Open Source Code | Yes | Code for this paper is published at https://github.com/khainb/RPSW. |
| Open Datasets | Yes | Utilizing the MNIST dataset (LeCun et al., 1998), we select images of digit 1 to construct the source distribution and images of digit 0 to construct the target distribution. We follow the setting in Xiao et al. (2021) for diffusion models on CIFAR10 (Krizhevsky et al., 2009) with N = 1800 epochs. (A data-construction sketch appears below this table.) |
| Dataset Splits | No | The paper mentions using standard datasets like MNIST and CIFAR10 but does not explicitly provide specific training, validation, and test split percentages or sample counts for reproducibility within the text. |
| Hardware Specification | Yes | For the gradient flow experiments, we use a HP Omen 25L desktop for conducting experiments. For diffusion model experiments, we use a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper does not explicitly list specific software components with their version numbers required for reproduction (e.g., Python 3.x, PyTorch x.x, CUDA x.x). |
| Experiment Setup | Yes | We set the total number of projections for SW variants to 10. For DSW, RPSW, and IWRPSW, we set the concentration parameter κ of the PS distribution as a dynamic quantity, i.e., κ(t) = (κ0 − 1)(N − t − 1)/(N − 1) + 10/N with N = 300 and κ0 ∈ {100, 50}. We set L = 10^4 for SW, DSW, EBSW, RPSW, and IWRPSW. We set T ∈ {2, 5, 10} for DSW and T ∈ {100, 1000} for Max-SW. We set the initial learning rate for the discriminator to 10^−4, the initial learning rate for the generator to 1.6 × 10^−4, the Adam optimizer with parameters (β1, β2) = (0.5, 0.9), EMA decay to 0.9999, and batch size to 256. For the learning rate scheduler, we use cosine learning rate decay. (A configuration sketch appears below this table.) |
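
The Wasserstein-2 comparison quoted in the Research Type row cites the POT library (Flamary et al., 2021). Below is a minimal sketch of how such an evaluation is typically computed with POT; the point clouds, weights, and function name `wasserstein2` are illustrative assumptions, not the paper's exact evaluation code.

```python
import numpy as np
import ot  # POT: Python Optimal Transport (Flamary et al., 2021)

def wasserstein2(x, y):
    """Exact Wasserstein-2 distance between two uniform empirical measures."""
    a = np.full(x.shape[0], 1.0 / x.shape[0])  # uniform weights on source points
    b = np.full(y.shape[0], 1.0 / y.shape[0])  # uniform weights on target points
    M = ot.dist(x, y, metric="sqeuclidean")    # pairwise squared-Euclidean cost matrix
    return np.sqrt(ot.emd2(a, b, M))           # square root of the optimal transport cost

# Illustrative usage on two synthetic point clouds.
rng = np.random.default_rng(0)
print(wasserstein2(rng.normal(0, 1, (500, 2)), rng.normal(3, 1, (500, 2))))
```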
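Algorithm 1 (RPSW) is not reproduced verbatim here; the following is a hedged PyTorch sketch of the core idea as we read it: projecting directions are normalized "random paths", i.e., differences of random pairs drawn from the two empirical measures, and the sliced distance averages one-dimensional Wasserstein distances along those directions. The function names (`rpsw`, `one_d_wasserstein_pp`) and the equal-support-size assumption are ours, not the paper's.

```python
import torch

def one_d_wasserstein_pp(u, v, p=2):
    """W_p^p between 1-D empirical measures with equal support size (sorting trick)."""
    return (torch.sort(u, dim=0).values - torch.sort(v, dim=0).values).abs().pow(p).mean(dim=0)

def rpsw(x, y, num_projections=100, p=2, eps=1e-12):
    """Sketch of RPSW between point clouds x (n, d) and y (m, d), assuming n == m."""
    i = torch.randint(0, x.shape[0], (num_projections,))
    j = torch.randint(0, y.shape[0], (num_projections,))
    paths = x[i] - y[j]                                      # random paths between the measures
    theta = paths / (paths.norm(dim=1, keepdim=True) + eps)  # unit projecting directions
    xp, yp = x @ theta.T, y @ theta.T                        # (n, L) and (m, L) projections
    return one_d_wasserstein_pp(xp, yp, p).mean().pow(1.0 / p)
```

IWRPSW would additionally reweight the per-direction distances with importance weights, in the spirit of EBSW; consult Algorithm 1 and the released code at https://github.com/khainb/RPSW for the authors' exact procedure.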
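For the gradient-flow data described in the Open Datasets row, a minimal torchvision-based construction might look as follows; the root path and flattening/normalization choices are illustrative assumptions.

```python
from torchvision import datasets, transforms

# MNIST (LeCun et al., 1998): digit-1 images form the source, digit-0 images the target.
mnist = datasets.MNIST(root="data", train=True, download=True,
                       transform=transforms.ToTensor())
source = mnist.data[mnist.targets == 1].flatten(1).float() / 255.0  # digit 1 -> source
target = mnist.data[mnist.targets == 0].flatten(1).float() / 255.0  # digit 0 -> target
```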
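The Experiment Setup cell can be translated into a configuration sketch. The κ(t) schedule below follows our reconstruction of the extraction-garbled formula and should be checked against the paper; the models are placeholders, and only the optimizer, scheduler, EMA, and batch-size settings come from the table.

```python
import torch

def kappa_schedule(t, kappa0=50.0, N=300):
    # Reconstructed dynamic concentration: kappa(t) = (kappa0 - 1)(N - t - 1)/(N - 1) + 10/N.
    return (kappa0 - 1.0) * (N - t - 1) / (N - 1) + 10.0 / N

generator = torch.nn.Linear(8, 8)      # placeholder for the diffusion-model generator
discriminator = torch.nn.Linear(8, 8)  # placeholder for the discriminator
opt_g = torch.optim.Adam(generator.parameters(), lr=1.6e-4, betas=(0.5, 0.9))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.9))
# Cosine learning-rate decay over the reported N = 1800 epochs; EMA decay 0.9999.
sched_g = torch.optim.lr_scheduler.CosineAnnealingLR(opt_g, T_max=1800)
sched_d = torch.optim.lr_scheduler.CosineAnnealingLR(opt_d, T_max=1800)
ema_decay, batch_size = 0.9999, 256
```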