The Wasserstein Transform
Authors: Facundo Memoli, Zane Smith, Zhengchao Wan
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the performance of the Wasserstein transform method on different datasets as a preprocessing step prior to clustering and classification tasks. (Section 5, Implementation and Experiments) |
| Researcher Affiliation | Academia | 1Department of Mathematics, The Ohio State University, Ohio, USA 2Department of Computer Science and Engineering, University of Minnesota, Minnesota, USA 3Department of Computer Science and Engineering, The Ohio State University, Ohio, USA. |
| Pseudocode | No | The paper describes methods and processes in narrative text but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper references 'the Sinkhorn code from (Peyre, 2017)' and provides a link to a GitHub repository for that third-party code. However, it does not state that the authors are releasing their own implementation code for the Wasserstein Transform described in the paper. |
| Open Datasets | Yes | We compared the performance of our method with mean shift on the MNIST dataset (LeCun et al., 1998) and on Grassmannian manifold data (Cetingul & Vidal, 2009). |
| Dataset Splits | No | For the MNIST dataset, the paper states '5K test images (using 5K training images)'. For the Grassmann manifold data, it mentions 'randomly split the set into 100 test matrices and 300 train matrices'. However, it does not specify a separate validation split. |
| Hardware Specification | No | The paper mentions running on 'a 24 core server' but does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts. |
| Software Dependencies | No | The paper mentions Matlab's parallel computing toolbox and 'the implementation sinkhorn_log from (Peyre, 2017)' but does not provide version numbers for these software components. |
| Experiment Setup | Yes | In all of our experiments we used the implementation sinkhorn_log from (Peyre, 2017) with options.niter = 2, epsilon = 0.05, and options.tau = 0. The ε parameter (defining the neighborhood) was chosen to be the same and equal to 0.075 of the maximal value of the tangent distance matrix corresponding to the 10K points under consideration. The Gaussian kernel requires a standard deviation parameter, which we set to 2/3 of the ε-value used for Wε. |
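The setup row above refers to log-domain Sinkhorn iterations with a small iteration budget (options.niter = 2, epsilon = 0.05). The paper used Peyré's MATLAB routine; the sketch below is not that code but a minimal, generic log-domain Sinkhorn in Python, written from the standard entropic-OT updates, with the function name `sinkhorn_log` and all parameter defaults mirroring the values quoted in the table.

```python
import numpy as np
from scipy.special import logsumexp


def sinkhorn_log(a, b, C, epsilon=0.05, niter=2):
    """Log-domain Sinkhorn iterations for entropic optimal transport.

    a, b : 1-D source/target marginals (each summing to 1)
    C    : cost matrix of shape (len(a), len(b))
    Returns an (approximate) transport plan P of shape C.shape.
    """
    # Dual potentials, initialized at zero.
    f = np.zeros(len(a))
    g = np.zeros(len(b))
    log_a, log_b = np.log(a), np.log(b)
    for _ in range(niter):
        # Alternate projections onto the row and column marginal constraints,
        # computed in the log domain for numerical stability at small epsilon.
        f = epsilon * (log_a - logsumexp((g[None, :] - C) / epsilon, axis=1))
        g = epsilon * (log_b - logsumexp((f[:, None] - C) / epsilon, axis=0))
    # Recover the primal plan from the potentials.
    return np.exp((f[:, None] + g[None, :] - C) / epsilon)
```

Because the g-update runs last, the returned plan satisfies the column marginal b exactly, while the row marginal a is only approximate at niter = 2; this matches the general behavior of truncated Sinkhorn iterations, though the exact stopping and stabilization details of the (Peyre, 2017) code may differ.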