Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Efficient Discrete Multi Marginal Optimal Transport Regularization

Authors: Ronak Mehta, Jeffery Kline, Vishnu Suresh Lokhande, Glenn Fung, Vikas Singh

ICLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We provide technical details about this new regularization term and its properties, and we present experimental demonstrations of faster runtimes when compared to standard Wasserstein-style methods. Finally, on a range of experiments designed to assess effectiveness at enforcing fairness, we demonstrate our method compares well with alternatives.
Researcher Affiliation	Collaboration	Ronak Mehta UW-Madison Jeffery Kline Affirm Vishnu Suresh Lokhande UW-Madison Glenn Fung Liberty Mutual Vikas Singh UW-Madison
Pseudocode	Yes	Algorithm 1 Our proposed method: d-Dimensional Earth Mover's Distance (DEMD)
Open Source Code	Yes	Code is available at https://github.com/ronakrm/demd.
Open Datasets	Yes	The Adult, Communities and Crime, and German datasets are all available under Creative Commons Attribution 4.0 International (CC BY 4.0) licenses via the UCI Machine Learning Dataset Repository https://archive.ics.uci.edu/ml/index.php. The CelebA dataset http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html is available for noncommercial purposes. See the website for more details. ACS Data. The American Census Survey (ACS) has recently made available a large set of demographic data. The original UCI Adult dataset Dua & Graff (2017) was curated from this data; however, recent work by Ding et al. (2021) has identified temporal shifts in demographic data, and recommends using a more recent collection as a baseline when evaluating biases and adjusting for fairness. Part of their contribution includes APIs to directly interface with the data provided by the ACS, and the ability to identify and construct similar problems associated with the original UCI-provided dataset, albeit with updated data. Data for the income prediction task was downloaded from 2018, localized to Louisiana. Race is the provided group label, which we wish to be agnostic towards, over some measure of our output. Data was accessed using the folktables codebase https://github.com/zykls/folktables with MIT License. The US Census data accessed is available for use so long as it is not used in combination with other data to identify any particular respondent to a Census Bureau survey. See https://www.census.gov/data/developers/about/terms-of-service.html for more details.
Dataset Splits	Yes	The hyper-parameter selection has been done on a validation split obtained from the training dataset.
Hardware Specification	Yes	Experiments were conducted using NumPy and PyTorch on an Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz with an Nvidia Titan Xp GPU.
Software Dependencies	No	The paper mentions several software packages such as NumPy, PyTorch, SciPy, and the Python Optimal Transport (POT) library, but does not specify their version numbers. For example: "Experiments were conducted using NumPy and PyTorch" and "Computation of the EMD is readily available, as in the Python Optimal Transport (POT) library (Flamary et al., 2021)."
Experiment Setup	Yes	All experiments related to fairness use a three-layer fully-connected neural network classifier with a hidden layer size of 100. Numbers reported in Table 1 in the main paper are means and standard deviations over three replicate runs with different seeds. Hyperparameters were selected as described in the main paper, by taking the best result for each dataset over the parameter range λ ∈ [1.0, 0.1, 10, 0.01, 100, 0.001].
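
The selection protocol quoted in the Experiment Setup row (a grid over λ, three replicate runs per setting, mean and standard deviation reported) might be sketched roughly as follows. This is only an illustration, not the authors' code: `train_and_evaluate` is a hypothetical stand-in that returns a toy score, the seed values are invented, and selecting by the highest mean validation score is an assumption, since the row does not say how "best" is measured.

```python
import math
import random
import statistics

# Grid quoted in the Experiment Setup row.
LAMBDA_GRID = [1.0, 0.1, 10, 0.01, 100, 0.001]
# Three replicate runs; the actual seed values are not given (assumed here).
SEEDS = [0, 1, 2]


def train_and_evaluate(lam, seed):
    """Hypothetical stand-in for training the classifier with weight `lam`.

    Returns a toy validation score that peaks at lam = 0.1, plus a small
    seed-dependent perturbation. A real run would train the model instead.
    """
    rng = random.Random(seed)
    base = 1.0 / (1.0 + abs(math.log10(lam) + 1.0))
    return base + rng.uniform(-0.01, 0.01)


def select_lambda(grid, seeds, run=train_and_evaluate):
    """Return the lambda with the highest mean score and all (mean, std) pairs."""
    results = {}
    for lam in grid:
        scores = [run(lam, seed) for seed in seeds]
        results[lam] = (statistics.mean(scores), statistics.stdev(scores))
    # Assumption: "best" means the highest mean validation score.
    best = max(results, key=lambda lam: results[lam][0])
    return best, results


best_lambda, all_results = select_lambda(LAMBDA_GRID, SEEDS)
for lam, (mean, std) in sorted(all_results.items()):
    print(f"lambda={lam:<7} mean={mean:.4f} std={std:.4f}")
print("selected lambda:", best_lambda)
```

In an actual replication, `train_and_evaluate` would be replaced by a routine that trains the three-layer classifier on the training split and scores it on the validation split described in the Dataset Splits row.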