Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Entropic Gromov-Wasserstein Distances: Stability and Algorithms
Authors: Gabriel Rioux, Ziv Goldfeld, Kengo Kato
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide numerical experiments that validate our theory and demonstrate the state-of-the-art empirical performance of our algorithm. |
| Researcher Affiliation | Academia | Gabriel Rioux, Center for Applied Mathematics, Cornell University, Ithaca, NY 14853, USA; Ziv Goldfeld, School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA; Kengo Kato, Department of Statistics and Data Science, Cornell University, Ithaca, NY 14853, USA |
| Pseudocode | Yes | Algorithm 1: Fast gradient method with inexact oracle; Algorithm 2: Adaptive gradient method with inexact oracle; Algorithm 3: Sinkhorn Algorithm |
| Open Source Code | Yes | Implementations of Algorithms 1 and 2 are available at https://github.com/gabrielrioux/Entropic-Gromov-Wasserstein. |
| Open Datasets | Yes | We next assess the performance of our algorithms on real-world data from the Fashion-MNIST dataset (Xiao et al., 2017). |
| Dataset Splits | No | The paper does not use traditional training/validation/test splits. The Fashion-MNIST dataset is used to empirically validate properties of the EGW distance (invariance to isometries) rather than to train a model in a supervised learning setting; the images are processed (padded, rotated) and then compared. |
| Hardware Specification | Yes | All experiments were performed on a desktop computer with 16 GB of RAM and an Intel i5-10600k CPU using the Python programming language. |
| Software Dependencies | No | The paper mentions the Python programming language but does not specify a version. It also refers to "the standard implementation of Sinkhorn's algorithm from the Python Optimal Transport package (Flamary et al., 2021)", but no version number for this package is provided. |
| Experiment Setup | Yes | First, ε is chosen as 1.05 * 16 * sqrt(M4(µ0)M4(µ1)) so as to guarantee convexity of Φ for each instance by Theorem 6, and M is set to sqrt(M2(µ0)M2(µ1)) + 10^-5. Figure 2 presents the average runtime of both algorithms in this setting with the stopping condition \|\|G_k\|\|_F < 10^-6. To evaluate the performance of Algorithm 2 when convexity is unknown, ε is set to violate the condition of Theorem 6 while remaining large enough to avoid numerical errors; if errors occur in running Algorithm 2 or the mirror descent methods, ε is doubled until all algorithms converge without errors. The initial point C0 for Algorithm 2 is taken as the all-ones matrix scaled by min{M, 1} * 10^-5. Two ways of choosing the smoothness parameter L, which effectively dictates the rate of convergence, are considered. The first sets L to the theoretical upper bound from Theorem 6, i.e., L = 64 + 32^2 * ε^-1 * sqrt(M4(µ0)M4(µ1)). |
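The experiment-setup cell above fixes ε as 1.05 times the convexity threshold 16·sqrt(M4(µ0)M4(µ1)) and M as sqrt(M2(µ0)M2(µ1)) + 10^-5. A minimal sketch of these two parameter choices, under the assumption (not stated in this report) that M_p(µ) denotes the p-th moment E[|X|^p] and is estimated empirically from samples; `egw_parameters` and `slack` are illustrative names, not from the paper:

```python
import numpy as np

def egw_parameters(X0, X1, slack=1.05):
    """Sketch of the quoted parameter choices for samples X0, X1 (rows = points).

    Assumption: M_p(µ) is read as the empirical p-th moment of the Euclidean
    norm, i.e. mean(|x|^p) over the sample.
    """
    m2 = lambda X: np.mean(np.sum(X**2, axis=1))      # M2(µ): mean of |x|^2
    m4 = lambda X: np.mean(np.sum(X**2, axis=1)**2)   # M4(µ): mean of |x|^4
    # ε: 5% above the threshold 16*sqrt(M4(µ0)M4(µ1)) that guarantees convexity
    eps = slack * 16.0 * np.sqrt(m4(X0) * m4(X1))
    # M: sqrt(M2(µ0)M2(µ1)) plus a small offset, as in the quoted setup
    M = np.sqrt(m2(X0) * m2(X1)) + 1e-5
    return eps, M
```

The 1.05 factor keeps ε strictly above the convexity threshold, so Theorem 6 applies with a small margin rather than at equality.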
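The pseudocode row lists Sinkhorn's algorithm (Algorithm 3), and the dependencies row notes the authors use the standard implementation from the Python Optimal Transport package. For reference, a minimal NumPy sketch of the classical Sinkhorn fixed-point iteration for entropic optimal transport; this is not the authors' implementation, and the stopping rule and variable names are illustrative:

```python
import numpy as np

def sinkhorn(a, b, C, eps, n_iter=1000, tol=1e-9):
    """Entropic OT between histograms a (n,) and b (m,) with cost C (n, m).

    Alternately rescales the Gibbs kernel K = exp(-C/eps) so that the plan's
    marginals match a and b; returns the transport plan P.
    """
    K = np.exp(-C / eps)          # Gibbs kernel
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)           # enforce row marginals
        v = b / (K.T @ u)         # enforce column marginals
        P = u[:, None] * K * v[None, :]
        if np.abs(P.sum(axis=1) - a).max() < tol:
            break
    return P
```

Note this plain (non-log-domain) form can underflow when eps is small relative to the cost scale; production code such as POT's `ot.sinkhorn` handles stabilization.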