Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Efficient Algorithms for Robust and Partial Semi-Discrete Optimal Transport

Authors: Pankaj Agarwal, Sharath Raghvendra, Pouyan Shirzadian, Keegan Yao

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We developed a preliminary implementation of this algorithm and experimentally demonstrate that (i) the algorithm executes within the theoretically predicted number of iterations, and (ii) both SDOPT and SDROT predominantly transport inlier mass. See Section 4.3. ... We implemented a prototype of our algorithm in Python to empirically verify its performance. ... Table 1: The results of our experiments empirically verifying our theoretical claims.
Researcher Affiliation Academia Pankaj K. Agarwal Department of Computer Science Duke University; Sharath Raghvendra Department of Computer Science North Carolina State University; Pouyan Shirzadian Department of Computer Science Virginia Tech; Keegan Yao Department of Computer Science Duke University
Pseudocode Yes Our algorithm works in O(log( /ε)) scales, where is the diameter of A B. Each scale is associated with a parameter δ > 0 and computes a δ-feasible λ-robust transport plan. Our algorithm maintains a set of weights y( ) for B and uses a few subroutines that are described in the appendix. At the beginning of the first scale, set δ = and y(b) = 0 for all b B. Execute the following steps while δ > ε/2. ... E.2 SEARCHANDCONSOLIDATE Procedure ... E.3 REDUCEWEIGHTS Procedure ... E.4 SEARCHANDAUGMENT Procedure ... E.5 INCREASEWEIGHTS Procedure ... E.6 ACYCLIFY Procedure
Open Source Code Yes The code is available at https://github.com/pouyansh/Efficient_Partial_and_Robust_SDOT.
Open Datasets No We evaluated our implementation on a Gaussian distribution with mean [0.5, 0.5] and covariance 0.15I bounded within the unit square, and added 10% noise to the bottom-left corner by mixing with an exponential distribution with rate parameter 3. The discrete distribution consists of n samples drawn from the same Gaussian with an additional 10% noise sampled from an exponential distribution with rate parameter 3 in the top-right corner. ... In this experiment, we use mixtures of Beta distributions that induce skewed and dispersed mass.
Dataset Splits No The paper describes how the synthetic data is generated and used for experiments (e.g., 'The discrete distribution consists of n samples drawn from the same Gaussian...'), but it does not specify explicit training/test/validation splits for the data. The entire generated distribution seems to be used for each experimental run to evaluate the algorithm's performance on optimal transport problems.
Hardware Specification No Justification: the experimental sections does not include any resource specific results, such as running times.
Software Dependencies No We implemented a prototype of our algorithm in Python to empirically verify its performance. The implementation omits two efficiency-critical components: (i) the computation of continuous mass within regions defined by restricted Voronoi cells, which we approximate using a fine uniform grid over the domain by summing the mass of grid points lying inside each region, and (ii) the dynamic tree data structure, which we simulate by naïvely iterating over all edges due to the lack of efficient Python implementations. (No specific version numbers are provided for Python or any libraries.)
Experiment Setup Yes We fixed λ = 0.2 and ε = 0.02, and varied n from 10 to 80. For each value of n, we ran the program 10 times and reported the averaged statistics in Table 1. ... (Appendix B) We used a continuous distribution as a Gaussian distribution bounded inside the unit square that is contaminated with 10% noise in the bottom left corner... Our discrete support consists of 0.9n samples from the same Gaussian distribution... along with 0.1n samples from an exponential distribution... The ground distance is the squared Euclidean distance.