Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Scalable Distributional Robustness in a Class of Non-Convex Optimization with Guarantees
Authors: Avinandan Bose, Arunesh Sinha, Tien Mai
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally compare our different approaches and baselines, and reveal nuanced properties of a DRO solution. Our final contribution is detailed experiments validating the scalability of our approaches on a simulated security game problem as well as two variants of facility location using park and ride data-sets from New York [Holguin-Veras et al., 2012]. |
| Researcher Affiliation | Academia | Avinandan Bose University of Washington EMAIL Arunesh Sinha Rutgers University EMAIL Tien Mai Singapore Management University EMAIL |
| Pseudocode | No | The paper contains mathematical formulations and transformations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The ethics checklist states 'Yes' to including code, data, and instructions, but the paper itself does not provide a specific URL or clear statement of where the code for the methodology can be accessed (e.g., in supplementary material or a public repository). |
| Open Datasets | Yes | P&R-NYC Dataset: We use a large and challenging Park-and-ride (P&R) dataset collected in New York City, which provides utilities for 82341 clients (N) for 59 park and ride locations (M), along with their incumbent utilities for competing facilities [Holguin-Veras et al., 2012]; this data was directly used for MC-FLP. |
| Dataset Splits | No | The paper states 'We split the data (randomly) into training and test (80:20)' but does not explicitly mention a validation set or its split percentage. |
| Hardware Specification | Yes | We use a 2.1 GHz CPU with 128GB RAM. |
| Software Dependencies | No | The paper mentions 'CPLEX' as a solver but does not provide a specific version number for it or any other software dependencies. |
| Experiment Setup | Yes | We fix K = 10 in approximation via discretization as we find that objective increase saturates for this K (see Appendix K). The numbers reported for our baselines are the best values over 10 random initializations. We use our clustering approach with 50 clusters. |
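
The random 80:20 train/test split quoted under Dataset Splits can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_train_test`, the seed, and the use of integer client indices are all assumptions made for illustration. Only the 80:20 ratio and the client count N = 82341 come from the table above.

```python
import random


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and split it into (train, test).

    Hypothetical helper: the paper reports a random 80:20 split but does not
    state a seed or an implementation, so both are assumed here.
    """
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(items)         # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Assumed stand-in for the 82341 clients in the P&R-NYC dataset.
client_ids = list(range(82341))
train_ids, test_ids = split_train_test(client_ids)
```

Recording the seed alongside the split ratio is what would make such a split reproducible; with no validation set carved out, any hyperparameter tuning would have to reuse the training portion.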