The Wasserstein Transform
Authors: Facundo Memoli, Zane Smith, Zhengchao Wan
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the performance of the Wasserstein transform method on different datasets as a preprocessing step prior to clustering and classification tasks. (Section 5, Implementation and Experiments) |
| Researcher Affiliation | Academia | 1Department of Mathematics, The Ohio State University, Ohio, USA 2Department of Computer Science and Engineering, University of Minnesota, Minnesota, USA 3Department of Computer Science and Engineering, The Ohio State University, Ohio, USA. |
| Pseudocode | No | The paper describes methods and processes in narrative text but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper references 'the Sinkhorn code from (Peyre, 2017)' and provides a link to a GitHub repository for that third-party code. However, it does not state that the authors are releasing their own implementation code for the Wasserstein Transform described in the paper. |
| Open Datasets | Yes | We compared the performance of our method with mean shift on the MNIST dataset (LeCun et al., 1998) and on Grassmannian manifold data (Cetingul & Vidal, 2009). |
| Dataset Splits | No | For the MNIST dataset, the paper states '5K test images (using 5K training images)'. For the Grassmann manifold data, it mentions 'randomly split the set into 100 test matrices and 300 train matrices'. However, it does not specify a separate validation split. |
| Hardware Specification | No | The paper mentions running on 'a 24 core server' but does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts. |
| Software Dependencies | No | The paper mentions Matlab's parallel computing toolbox and 'the implementation sinkhorn_log from (Peyre, 2017)' but does not provide version numbers for these software components. |
| Experiment Setup | Yes | In all of our experiments we used the implementation sinkhorn_log from (Peyre, 2017) with options.niter = 2, epsilon = 0.05, and options.tau = 0. The ε parameter (defining the neighborhood) was chosen to be the same and equal to 0.075 of the maximal value of the tangent distance matrix corresponding to the 10K points under consideration. The Gaussian kernel requires a standard deviation parameter, which we set to 2/3 of the ε-value used for Wε. |
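The setup row above refers to log-domain Sinkhorn iterations with a small iteration budget (options.niter = 2, epsilon = 0.05). The paper used Peyré's MATLAB routine; the sketch below is not that code but a minimal, generic log-domain Sinkhorn in Python, written from the standard entropic-OT updates, with the function name `sinkhorn_log` and all parameter defaults mirroring the values quoted in the table.

```python
import numpy as np
from scipy.special import logsumexp


def sinkhorn_log(a, b, C, epsilon=0.05, niter=2):
    """Log-domain Sinkhorn iterations for entropic optimal transport.

    a, b : 1-D source/target marginals (each summing to 1)
    C    : cost matrix of shape (len(a), len(b))
    Returns an (approximate) transport plan P of shape C.shape.
    """
    # Dual potentials, initialized at zero.
    f = np.zeros(len(a))
    g = np.zeros(len(b))
    log_a, log_b = np.log(a), np.log(b)
    for _ in range(niter):
        # Alternate projections onto the row and column marginal constraints,
        # computed in the log domain for numerical stability at small epsilon.
        f = epsilon * (log_a - logsumexp((g[None, :] - C) / epsilon, axis=1))
        g = epsilon * (log_b - logsumexp((f[:, None] - C) / epsilon, axis=0))
    # Recover the primal plan from the potentials.
    return np.exp((f[:, None] + g[None, :] - C) / epsilon)
```

Because the g-update runs last, the returned plan satisfies the column marginal b exactly, while the row marginal a is only approximate at niter = 2; this matches the general behavior of truncated Sinkhorn iterations, though the exact stopping and stabilization details of the (Peyre, 2017) code may differ.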