Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Projection Robust Wasserstein Barycenters
Authors: Minhui Huang, Shiqian Ma, Lifeng Lai
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We incorporate the RPRWB into a discrete distribution clustering algorithm, and the numerical results on real text datasets confirm that our RPRWB model helps improve the clustering performance significantly. |
| Researcher Affiliation | Academia | 1 Department of Electrical and Computer Engineering, University of California, Davis, CA, USA 2 Department of Mathematics, University of California, Davis, CA, USA. |
| Pseudocode | Yes | Algorithm 1 The RGA-IBP Algorithm; Algorithm 2 The RBCD Algorithm; Algorithm 3 Round(π, p, q) |
| Open Source Code | Yes | Code available at https://github.com/mhhuang95/PRWB. |
| Open Datasets | Yes | We use the pre-trained word-vector dataset GloVe (Pennington et al., 2014)... The Reuters Subset is a 5-class subset of the Reuters dataset. The BBCnews Abstract and BBCsport Abstract (Greene & Cunningham, 2006)... |
| Dataset Splits | No | The paper describes using datasets for clustering performance evaluation and mentions AMI scores, but does not explicitly detail a validation dataset split or usage. |
| Hardware Specification | Yes | All experiments are conducted on a Linux server with a 32-core Intel Xeon CPU (E5-2667, v4, 3.20GHz per core). |
| Software Dependencies | No | The paper mentions using GloVe 300d word vectors but does not specify software dependencies with version numbers for the implementation of the proposed algorithms or other experimental tools. |
| Experiment Setup | Yes | We set the step size τ = 0.0005 for both RBCD and RGA-IBP algorithms and η = 0.5 · mid({C_l}_{l ∈ [m]})... We set ϵ = 10^{-4}... We choose k = 2 for the BBCsport Abstract dataset and k = 3 for the Reuters Subset and the BBCnews Abstract datasets. |
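
The Experiment Setup row quotes a step size τ, a regularization parameter η tied to the cost matrices, and a per-dataset k. The sketch below collects those quoted values in one place and shows one possible way to compute η. It is not taken from the authors' repository: the names `TAU`, `K_BY_DATASET`, and `eta_from_costs` are hypothetical, and reading "mid" as the median over all entries of all cost matrices pooled together is an assumption.

```python
from statistics import median

# Step size quoted for both the RBCD and RGA-IBP algorithms.
TAU = 0.0005

# Values of k quoted per dataset (dictionary keys are illustrative labels).
K_BY_DATASET = {
    "BBCsport Abstract": 2,
    "Reuters Subset": 3,
    "BBCnews Abstract": 3,
}


def eta_from_costs(cost_matrices, scale=0.5):
    """Return scale * median of all cost entries.

    Assumption: 'mid' in the quoted setup means the median, taken over
    the entries of every cost matrix pooled into a single list.
    """
    entries = [c for matrix in cost_matrices for row in matrix for c in row]
    return scale * median(entries)


# Toy cost matrices (made-up numbers, for illustration only).
toy_costs = [
    [[0.0, 1.0], [1.0, 4.0]],
    [[0.0, 2.0], [2.0, 3.0]],
]
eta = eta_from_costs(toy_costs)
```

If the paper defines "mid" differently (for example, a per-matrix median), only the body of `eta_from_costs` would need to change.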