Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning to Match via Inverse Optimal Transport
Authors: Ruilin Li, Xiaojing Ye, Haomin Zhou, Hongyuan Zha
JMLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We back up our claims with numerical experiments on both synthetic data and real world data sets. |
| Researcher Affiliation | Academia | Ruilin Li (School of Mathematics and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA); Xiaojing Ye (Department of Mathematics and Statistics, Georgia State University, Atlanta, GA 30302, USA); Haomin Zhou (School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA); Hongyuan Zha (School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA) |
| Pseudocode | Yes | Algorithm 1: Sinkhorn-Knopp Algorithm; Algorithm 2: Solve RIOT |
| Open Source Code | No | The paper does not provide explicit access to source code for the methodology described. It mentions third-party code in a footnote, but not the authors' own. |
| Open Datasets | Yes | New York Taxi data set... https://www.kaggle.com/c/nyc-taxi-trip-duration/data Marriage data set... https://www.dhsdata.nl/site/users/login |
| Dataset Splits | Yes | We evaluate all models on the Dutch Household Survey (DHS) data set from 2005 to 2014, excluding 2008 (due to data field inconsistency). After data cleaning, the data set consists of 2475 couples. For each person we extract 11 features including education level, height, weight, health and 7 characteristic traits, namely irresponsible, accurate, ever-ready, disciplined, ordered, clumsy and detail-oriented. ... We train all models on the training data set and measure the error between the predicted and test matching matrices by root mean square error (RMSE) and mean absolute error (MAE) using 5-fold cross-validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | For the synthetic data set, we set λ = λu = λv = 1 and simulate m = 10 user profiles {u_i} ⊂ R^10, n = 10 item profiles {v_j} ⊂ R^8, two probability vectors µ0, ν0 ∈ R^10, an interaction matrix A0 of size 10 × 8, and pick the polynomial kernel k(x, y) = (γ x^T y + c0)^d with γ = 0.05, c0 = 1, d = 2; hence C0_ij = (0.05 u_i^T A0 v_j + 1)^2. For Cu, Cv, we randomly generate m and n points from N(0, 5 I_2) in the plane and use their Euclidean distance matrices as Cu and Cv. In Algorithm 2, we set the number of inner-loop iterations K = 20. ... To produce Figure 2, we set the number of outer-loop iterations L = 50 and the learning rate s = 10. For each N ∈ {50, 100, 200, 500, 1000, 2000} we run Algorithm 2 and record the Kullback-Leibler divergence between the learned matching matrices π_IOT, π_RIOT and the ground-truth matching matrix π0. |
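The Pseudocode row lists the Sinkhorn-Knopp algorithm (Algorithm 1), the standard scheme for entropy-regularized optimal transport. A minimal NumPy sketch follows; the function name, the regularization convention (λ as an inverse temperature in the Gibbs kernel), and the fixed iteration count are our assumptions, not the paper's exact implementation.

```python
import numpy as np

def sinkhorn_knopp(C, mu, nu, lam=1.0, n_iter=200):
    """Entropy-regularized OT plan via Sinkhorn-Knopp iterations (sketch).

    C   : (m, n) cost matrix
    mu  : (m,) source marginal (sums to 1)
    nu  : (n,) target marginal (sums to 1)
    lam : regularization parameter; larger lam gives a sharper plan
    """
    K = np.exp(-lam * C)                  # Gibbs kernel
    a = np.ones_like(mu)
    for _ in range(n_iter):
        b = nu / (K.T @ a)                # rescale to match column marginals
        a = mu / (K @ b)                  # rescale to match row marginals
    return a[:, None] * K * b[None, :]    # transport plan pi = diag(a) K diag(b)

# Tiny usage example on a random 10 x 8 cost matrix with uniform marginals
rng = np.random.default_rng(0)
C = rng.random((10, 8))
mu = np.full(10, 1 / 10)
nu = np.full(8, 1 / 8)
pi = sinkhorn_knopp(C, mu, nu)
```

After convergence, the rows of `pi` sum to `mu` and its columns to `nu`, which is the stopping criterion one would check in practice.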
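The Experiment Setup row can be made concrete with a short sketch of the synthetic cost construction: the polynomial-kernel ground cost C0 and the Euclidean distance matrix Cu. The variable names (`U`, `V`, `A`) and the standard-normal draws for the profiles are our assumptions; only the shapes, the kernel parameters (γ = 0.05, c0 = 1, d = 2), and the N(0, 5 I_2) draw for Cu come from the quoted setup.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 10
U = rng.standard_normal((m, 10))            # user profiles u_i in R^10 (assumed draw)
V = rng.standard_normal((n, 8))             # item profiles v_j in R^8 (assumed draw)
A = rng.standard_normal((10, 8))            # interaction matrix A0 of size 10 x 8

gamma, c0, d = 0.05, 1.0, 2                 # polynomial kernel parameters from the paper
C0 = (gamma * (U @ A @ V.T) + c0) ** d      # C0_ij = (0.05 u_i^T A0 v_j + 1)^2

# Cu: pairwise Euclidean distances of m points drawn from N(0, 5 I_2) in the plane
P = rng.multivariate_normal(np.zeros(2), 5.0 * np.eye(2), size=m)
Cu = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
```

Cv is built the same way from n independent planar points; note Cu is symmetric with a zero diagonal, as a distance matrix must be.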