Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Wasserstein Logistic Regression with Mixed Features
Authors: Aras Selvi, Mohammad Reza Belbasi, Martin Haugh, Wolfram Wiesemann
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our method outperforms both the unregularized and the regularized logistic regression on categorical as well as mixed-feature benchmark instances. (Abstract) and We report numerical results in Section 4. (Intro) and Figure 1: Left: Estimates of β for the standard logistic regression... All results are reported as averages over 2,000 statistically independent runs. (Section 2.1) and Figure 2: Runtime comparison between our column-and-constraint scheme and a naïve solution of problem (4) as a monolithic exponential conic program. (Section 4.1). |
| Researcher Affiliation | Academia | Aras Selvi, Mohammad Reza Belbasi, Martin B. Haugh, Wolfram Wiesemann; Imperial College Business School, Imperial College London, United Kingdom; EMAIL |
| Pseudocode | Yes | Algorithm 1 Column-and-Constraint Generation Scheme for Problem (4). and Algorithm 2 Identification of Most Violated Constraints in the Reduced Problem (4). (Section 3) |
| Open Source Code | Yes | All source codes and detailed results are available on GitHub (https://github.com/selvi-aras/WassersteinLR). (Section 4) |
| Open Datasets | Yes | on the 14 most popular UCI data sets that only contain categorical features having more than 30 rows [11] (varying licenses). (Section 4.2) |
| Dataset Splits | Yes | All results are reported as means over 100 random training set-test set splits (80%:20%). and The radius ϵ ∈ {...} as well as the Lasso penalty γ ∈ {...} are selected via 5-fold cross-validation. (Section 4.2) |
| Hardware Specification | Yes | All algorithms were implemented in Julia [5] (MIT license) and executed on Intel Xeon 2.66GHz processors with 8GB memory in single-core mode. (Section 4) |
| Software Dependencies | Yes | All algorithms were implemented in Julia [5] (MIT license)... We use MOSEK 9.3 [26] (commercial) to solve all exponential conic programs through JuMP [12] (MPL2 License). (Section 4) |
| Experiment Setup | Yes | The radius ϵ ∈ {0, 10⁻⁵, …, 10⁻⁴, …, 1} of the Wasserstein ball as well as the Lasso penalty γ ∈ {0, 1/2 · 10⁻⁵, …, 1/2 · 10⁻⁴, …, 1/2} are selected via 5-fold cross-validation. We consider two variants of our DR logistic regression that employ a different output label weight (κ = 1 vs. κ = m) in the ground metric (cf. Definition 2). (Section 4.2) |
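
The protocol quoted under Dataset Splits and Experiment Setup (100 random 80%:20% splits, with a hyperparameter chosen by 5-fold cross-validation on each training set) can be sketched roughly as below. This is an illustration only, not the authors' code: the paper's implementation is in Julia, whereas this sketch uses plain Python, and every function name (`split_80_20`, `k_folds`, `select_by_cv`, `run_protocol`) as well as the `error_fn` callback are hypothetical. The `EPSILONS` list contains only the grid values visible in the excerpt, reading the exponents as negative powers of ten; the elided intermediate values are not reproduced.

```python
import random
from statistics import mean

# Illustrative grid only (assumption): the radius values visible in the
# quoted excerpt. The elided intermediate grid points are NOT filled in.
EPSILONS = [0.0, 1e-5, 1e-4, 1.0]


def split_80_20(n, rng, test_fraction=0.2):
    """Shuffle indices 0..n-1 and return (train, test) index lists."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = max(1, round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def k_folds(indices, k, rng):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, folds[i]


def select_by_cv(train, grid, error_fn, rng, k=5):
    """Return the grid entry with the lowest mean validation error."""
    folds = list(k_folds(train, k, rng))  # reuse the same folds per candidate

    def cv_error(param):
        return mean(error_fn(fit, val, param) for fit, val in folds)

    return min(grid, key=cv_error)


def run_protocol(n, grid, error_fn, n_splits=100, seed=0):
    """Average test error over repeated random splits with CV-based tuning.

    error_fn(fit_idx, eval_idx, param) is a hypothetical callback that
    trains on fit_idx with the given hyperparameter and returns the
    error measured on eval_idx.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_splits):
        train, test = split_80_20(n, rng)
        best = select_by_cv(train, grid, error_fn, rng)
        errors.append(error_fn(train, test, best))
    return mean(errors)
```

A caller would supply an `error_fn` that fits the model of interest and returns its test error, for example `run_protocol(len(data), EPSILONS, error_fn)`. Reusing the same folds for every candidate inside `select_by_cv` is a design choice of this sketch so that candidates are compared on identical data; the excerpt does not say how the authors handled this.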