Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Order Constraints in Optimal Transport
Authors: Yu Chin Fabian Lim, Laura Wynter, Shiau Hong Lim
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate experimentally that order constraints improve explainability using the e-SNLI (Stanford Natural Language Inference) dataset that includes human-annotated rationales as well as on several image color transfer examples. |
| Researcher Affiliation | Industry | IBM Research, Singapore. |
| Pseudocode | Yes | Algorithm 1 Iterative procedure for OT under OC Oij[k] with linear costs f(X) = tr(Dᵀ X). ... Algorithm 2 ePAVA for C2 = Oij[k] for k ∈ [mn] ... Algorithm 3 Learning subtree T̂(k1, k2, k3, τ1, τ2) of T(k3, τ1, τ2) and top-k2 candidate plans for linear costs f(Π) = tr(Dᵀ Π). |
| Open Source Code | Yes | Optimal Transport with Order Constraints can be found in the AI Explainability 360 toolbox, which is part of the IBM Research Trusted AI library (Arya et al., 2019) at https://github.com/Trusted-AI/AIX360. |
| Open Datasets | Yes | We use an annotated dataset from the enhanced Stanford Natural Language Inference (e-SNLI) (Camburu et al., 2018; Swanson et al., 2020) ... We use source images from the SUN dataset (Xiao et al., 2010; Yu et al., 2016), and target images from WikiArt (Tan et al., 2016). |
| Dataset Splits | Yes | We used sizes of (100K, 10K, 5K) for train, validation, and test, respectively. |
| Hardware Specification | Yes | Each computation run of Alg. 1 is measured on a single Intel x86 64 bit Xeon 2MHZ with 12GB memory per core. ... Classifier training was performed on a multi-core Ubuntu virtual machine and on a nVidia Tesla P100-PCIE-16GB GPUs. |
| Software Dependencies | No | The paper mentions 'python-based algorithms', 'scipy.optimize', 'numpy', and 'C++-based cvxpy', but does not provide specific version numbers for these software components, which are crucial for reproducibility. |
| Experiment Setup | Yes | The thresholds that constrain T(k3, τ1, τ2) are set to (τ1, τ2) = (.5, .5). ... The iterations are set to terminate at 1e4 rounds or a max projection error of 1e-4, and these settings achieve an average functional approximation of 0.51% error (within .19). We use penalty ρ = 1.0. |
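
The linear cost f(X) = tr(Dᵀ X) quoted in the Pseudocode row can be illustrated with a minimal sketch of plain optimal transport posed as a linear program, using the `numpy` and `scipy.optimize` packages named in the Software Dependencies row. This is not the paper's Algorithm 1 and it omits the order constraints entirely. The function name `solve_ot_lp`, the marginals `a` and `b`, and the toy cost matrix are assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import linprog


def solve_ot_lp(D, a, b):
    """Solve min_X tr(D^T X) subject to X 1 = a, X^T 1 = b, X >= 0.

    Hypothetical helper: unconstrained OT only, no order constraints.
    """
    m, n = D.shape
    # X is flattened row-major, so entry (i, j) sits at index i * n + j.
    # Row-sum constraints: sum_j X[i, j] = a[i]
    A_rows = np.kron(np.eye(m), np.ones((1, n)))
    # Column-sum constraints: sum_i X[i, j] = b[j]
    A_cols = np.kron(np.ones((1, m)), np.eye(n))
    A_eq = np.vstack([A_rows, A_cols])
    b_eq = np.concatenate([a, b])
    # tr(D^T X) equals the dot product of the flattened D and X.
    res = linprog(D.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x.reshape(m, n), res.fun


# Toy 2x2 problem (assumed data): zero cost on the diagonal.
D = np.array([[0.0, 1.0], [1.0, 0.0]])
a = np.array([0.5, 0.5])
b = np.array([0.5, 0.5])
plan, cost = solve_ot_lp(D, a, b)
```

Because the paper does not report library versions, results from a sketch like this may differ slightly across `scipy` releases; recording `numpy.__version__` and `scipy.__version__` alongside any run is a cheap safeguard.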