Smooth $p$-Wasserstein Distance: Structure, Empirical Approximation, and Statistical Applications

Authors: Sloan Nietert, Ziv Goldfeld, Kengo Kato

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present several numerical experiments supporting the theoretical results established in the previous sections. We focus on $p = 2$ so that we can use the MMD form for $d_2^{(\sigma)}$. Code is provided at https://github.com/sbnietert/smooth-Wp. First, we examine $W_2^{(\sigma)}(\mu, \hat{\mu}_n)$ directly, with computations feasible for small sample sizes using the stochastic averaged gradient (SAG) method as proposed by Genevay et al. (2016) and implemented by Hallin et al. (2020). In Figure 1 (left), we take $\mu = \mathrm{Unif}([-1, 1]^d)$ and estimate $\mathbb{E}[W_2^{(\sigma)}(\hat{\mu}_n, \mu)]$ averaged over 10 trials, for varied $d$ and $\sigma$. (A code sketch of this estimation step follows the table.)
Researcher Affiliation | Academia | 1) Department of Computer Science, Cornell University, Ithaca, NY; 2) School of Electrical and Computer Engineering, Cornell University, Ithaca, NY; 3) Department of Statistics and Data Science, Cornell University, Ithaca, NY
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is provided at https://github.com/sbnietert/smooth-Wp.
Open Datasets | No | The paper uses synthetic distributions (e.g., $\mu = \mathrm{Unif}([-1, 1]^d)$, $\mu = N_s$) from which samples are drawn, but does not provide access information (link, DOI, citation) to a pre-existing publicly available dataset file.
Dataset Splits | No | The paper does not provide specific details on train/validation/test dataset splits (e.g., percentages, sample counts, or explicit splitting methodology).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running experiments.
Software Dependencies | No | The paper mentions methods and their implementations (e.g., the stochastic averaged gradient method implemented by Hallin et al., 2020) but does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | In Figure 1 (left), we take $\mu = \mathrm{Unif}([-1, 1]^d)$ and estimate $\mathbb{E}[W_2^{(\sigma)}(\hat{\mu}_n, \mu)]$ averaged over 10 trials, for varied $d$ and $\sigma$. ... Distributions are computed using kernel density estimation over 50 trials and estimating $\mu$ by $\hat{\mu}_{1000}$. ... Plotted in Figure 3 are $n$-scaled scatter plots of the estimation errors, with 40 trials for each $\sigma$ and $n$ pair. ... using 1000 and 200 bootstrap samples for $d = 1$ and $d = 2$, respectively. The probability of rejecting the null hypothesis for varied significance levels and sample sizes is estimated by repeating the tests over 100 and 200 draws of the original samples, for $d = 1$ and $d = 2$, respectively. (A bootstrap-test sketch follows the table.)
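
For concreteness, the estimation experiment quoted in the Research Type row (estimating $\mathbb{E}[W_2^{(\sigma)}(\hat{\mu}_n, \mu)]$ for $\mu = \mathrm{Unif}([-1, 1]^d)$) can be mimicked with a short Monte Carlo sketch. The snippet below is an illustrative approximation, not the authors' implementation: it approximates $W_2^{(\sigma)}$ by adding $\mathcal{N}(0, \sigma^2 I_d)$ noise to both point clouds and solving the resulting discrete OT problem exactly with POT's ot.emd2, whereas the paper uses the SAG solver of Genevay et al. (2016) as implemented by Hallin et al. (2020). The helper name smooth_w2 and all parameter values are assumptions made for illustration.

```python
import numpy as np
import ot  # Python Optimal Transport (POT), assumed installed


def smooth_w2(x, y, sigma, rng):
    """Plug-in estimate of the smooth 2-Wasserstein distance W_2^(sigma)(x, y).

    Smoothing is approximated by adding independent N(0, sigma^2 I) noise to
    every sample point; the empirical 2-Wasserstein distance between the noised
    point clouds is then computed exactly (ot.emd2 returns the squared cost).
    """
    xs = x + sigma * rng.standard_normal(x.shape)
    ys = y + sigma * rng.standard_normal(y.shape)
    a = np.full(len(xs), 1.0 / len(xs))  # uniform weights on the x-points
    b = np.full(len(ys), 1.0 / len(ys))  # uniform weights on the y-points
    cost = ot.dist(xs, ys)               # pairwise squared Euclidean costs
    return np.sqrt(ot.emd2(a, b, cost))


rng = np.random.default_rng(0)
d, n, sigma = 2, 100, 1.0        # illustrative dimension, sample size, noise level
n_proxy, trials = 1000, 10       # mu is proxied by a large independent sample

# Monte Carlo estimate of E[W_2^(sigma)(mu_hat_n, mu)] for mu = Unif([-1, 1]^d).
errors = []
for _ in range(trials):
    x = rng.uniform(-1.0, 1.0, size=(n, d))        # empirical measure mu_hat_n
    y = rng.uniform(-1.0, 1.0, size=(n_proxy, d))  # proxy sample for mu
    errors.append(smooth_w2(x, y, sigma, rng))
print(f"estimated E[W_2^({sigma})(mu_hat_{n}, mu)] ~= {np.mean(errors):.4f}")
```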
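
The two-sample testing setup quoted in the Experiment Setup row can likewise be sketched as a bootstrap test built on the same plug-in statistic. This is a minimal sketch under assumed details (a pooled-resampling bootstrap for the critical value, and illustrative distributions, sample sizes, and significance level); the paper's exact bootstrap scheme and calibration may differ, and bootstrap_test is a hypothetical helper name.

```python
import numpy as np
import ot  # Python Optimal Transport (POT), assumed installed


def smooth_w2(x, y, sigma, rng):
    """Empirical smooth 2-Wasserstein distance via noise injection + exact OT."""
    xs = x + sigma * rng.standard_normal(x.shape)
    ys = y + sigma * rng.standard_normal(y.shape)
    a = np.full(len(xs), 1.0 / len(xs))
    b = np.full(len(ys), 1.0 / len(ys))
    return np.sqrt(ot.emd2(a, b, ot.dist(xs, ys)))


def bootstrap_test(x, y, sigma, n_boot, alpha, rng):
    """Two-sample test of H0: mu = nu, calibrated by a pooled bootstrap.

    The critical value is the (1 - alpha)-quantile of the statistic recomputed
    on resamples drawn with replacement from the pooled sample, mimicking H0.
    """
    stat = smooth_w2(x, y, sigma, rng)
    pooled = np.concatenate([x, y])
    boot_stats = []
    for _ in range(n_boot):
        xb = pooled[rng.integers(len(pooled), size=len(x))]
        yb = pooled[rng.integers(len(pooled), size=len(y))]
        boot_stats.append(smooth_w2(xb, yb, sigma, rng))
    return stat > np.quantile(boot_stats, 1.0 - alpha)


rng = np.random.default_rng(1)
d, n, sigma = 1, 100, 0.5
x = rng.standard_normal((n, d))          # sample from mu = N(0, I_d)
y = rng.standard_normal((n, d)) + 0.5    # sample from a mean-shifted alternative
print("reject H0:", bootstrap_test(x, y, sigma, n_boot=200, alpha=0.05, rng=rng))
```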