Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization
Authors: Omar Bennouna, Amine Bennouna, Saurabh Amin, Asuman Ozdaglar
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our main contribution is a sharp geometric characterization that identifies the directions of the cost vector that matter for optimality, relative to the task constraints and uncertainty set. We further develop a practical algorithm that, for a given task, constructs a minimal or least-costly sufficient dataset. Our results reveal that small, well-chosen datasets can often fully determine optimal decisions, offering a principled foundation for task-aware data selection. ... The NeurIPS checklist question 'Does the paper report error bars suitably and correctly defined or other appropriate information about the statistical significance of the experiments?' is answered 'No' with the justification 'There is no meaningful randomness in the experiments. Once the data is fixed, the output is deterministic. The generated data is for illustration only, and its randomness is irrelevant to the paper's claims.' |
| Researcher Affiliation | Academia | Omar Bennouna MIT EMAIL Amine Bennouna Northwestern University EMAIL Saurabh Amin MIT EMAIL Asuman Ozdaglar MIT EMAIL |
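| Pseudocode | Yes | Algorithm 1: Meta-Algorithm Computing $\operatorname{dir}(X^\star(C))$. Input: decision set $X$, uncertainty set $C$. Output: a basis $D \subset \mathbb{R}^d$ of $\operatorname{dir}(X^\star(C))$. Initialize $D$ to $\emptyset$. Set $x_0 \in \arg\min_{x \in X} c_0^\top x$ for some $c_0 \in C$. While there exist $c \in C$ and $x \in \arg\min_{x' \in X} c^\top x'$ such that $x - x_0 \notin \operatorname{span} D$: $D \leftarrow D \cup \{x - x_0\}$. Return $D$. |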
| Open Source Code | No | NeurIPS Paper Checklist: 5. Open access to data and code. Answer: [No]. Justification: There is no data used in the experiments. Experiments are a basic application of the algorithm for illustration purposes. |
| Open Datasets | No | The GPAs of candidates are generated using a uniform distribution in the interval [2, 4], and the level of experience is also uniform in {1, 2, 3, 4, 5}. ... NeurIPS Paper Checklist: 5. Open access to data and code. Answer: [No]. Justification: There is no data used in the experiments. Experiments are a basic application of the algorithm for illustration purposes. |
| Dataset Splits | No | The paper describes generating synthetic data for illustration purposes in Section 6. It does not perform typical experiments requiring training/test/validation splits. |
| Hardware Specification | No | NeurIPS Paper Checklist: 8. Experiments compute resources. Answer: [No]. Justification: It is irrelevant to the paper's results. |
| Software Dependencies | No | The MIP of Algorithm 2 is solved using Gurobi. |
| Experiment Setup | No | The paper primarily presents theoretical characterizations and an algorithm. The 'Application: Hiring Interviews' section (Section 6) describes problem parameters like 'η controls the misspecification level' and how candidate features are generated (uniform distribution for GPA and experience levels), but does not detail experimental setup in terms of hyperparameters, training procedures, or specific solver settings beyond mentioning 'Gurobi' for the MILP. |
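
Illustrative sketch (not from the paper): the loop of Algorithm 1 quoted in the Pseudocode row can be mimicked on a toy instance. The version below assumes the decision set is given as a finite list of candidate points and the uncertainty set as a finite list of cost vectors, so the "there exists" check becomes brute-force enumeration. The function names `matrix_rank`, `in_span`, `argmin_set`, and `optimal_directions` are hypothetical, and this is not the MIP-based procedure the authors solve with Gurobi.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a list of equal-length vectors via exact Gaussian elimination."""
    rows = [[Fraction(a) for a in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            factor = rows[i][col] / rows[rank][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def in_span(basis, v):
    """True if v is a linear combination of the vectors in basis."""
    return matrix_rank(list(basis) + [v]) == matrix_rank(list(basis))


def argmin_set(points, c):
    """All points minimizing the linear cost c^T x over a finite point list."""
    values = [sum(ci * xi for ci, xi in zip(c, x)) for x in points]
    best = min(values)
    return [x for x, val in zip(points, values) if val == best]


def optimal_directions(points, costs):
    """Grow a basis D of differences x - x0 between optimal points.

    Assumption: `points` and `costs` are finite lists, standing in for the
    decision set and the uncertainty set of the quoted algorithm.
    """
    x0 = argmin_set(points, costs[0])[0]
    basis = []
    found = True
    while found:  # repeat until no optimal point adds a new direction
        found = False
        for c in costs:
            for x in argmin_set(points, c):
                diff = tuple(a - b for a, b in zip(x, x0))
                if not in_span(basis, diff):
                    basis.append(diff)
                    found = True
    return basis


# Toy instance: corners of the unit square and two candidate cost vectors.
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
candidate_costs = [(1, 1), (1, -1)]
directions = optimal_directions(square, candidate_costs)
print(directions)  # [(0, 1)]
```

On this toy instance the optimal point moves only from (0, 0) to (0, 1) as the cost varies, so a single direction is returned. Exact rational arithmetic is used for the span test to avoid floating-point tolerance choices; that is a convenience of the sketch, not something the source specifies.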