Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization

Authors: Omar Bennouna, Amine Bennouna, Saurabh Amin, Asuman Ozdaglar

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Our main contribution is a sharp geometric characterization that identifies the directions of the cost vector that matter for optimality, relative to the task constraints and uncertainty set. We further develop a practical algorithm that, for a given task, constructs a minimal or least-costly sufficient dataset. Our results reveal that small, well-chosen datasets can often fully determine optimal decisions, offering a principled foundation for task-aware data selection. ... The NeurIPS checklist question 'Does the paper report error bars suitably and correctly defined or other appropriate information about the statistical significance of the experiments?' is answered 'No' with the justification 'There is no meaningful randomness in the experiments. Once the data is fixed, the output is deterministic. The generated data is for illustration only, and its randomness is irrelevant to the paper's claims.'
Researcher Affiliation	Academia	Omar Bennouna (MIT), Amine Bennouna (Northwestern University), Saurabh Amin (MIT), Asuman Ozdaglar (MIT)
Pseudocode	Yes	Algorithm 1 (Meta-Algorithm Computing dir(X(C))). Input: decision set X, uncertainty set C. Output: a basis D ⊂ ℝ^d of dir(X(C)). Initialize D to ∅. Set x₀ ∈ arg min_{x ∈ X} c₀ᵀx for some c₀ ∈ C. While there exists c ∈ C and x ∈ arg min_{x ∈ X} cᵀx such that x − x₀ ∉ span D: D ← D ∪ {x − x₀}. Return D.
Open Source Code	No	NeurIPS Paper Checklist: 5. Open access to data and code. Answer: [No]. Justification: There is no data used in the experiments. Experiments are a basic application of the algorithm for illustration purposes.
Open Datasets	No	The GPAs of candidates are generated using a uniform distribution in the interval [2, 4], and the level of experience is also uniform in {1, 2, 3, 4, 5}. ... NeurIPS Paper Checklist: 5. Open access to data and code. Answer: [No]. Justification: There is no data used in the experiments. Experiments are a basic application of the algorithm for illustration purposes.
Dataset Splits	No	The paper describes generating synthetic data for illustration purposes in Section 6. It does not perform typical experiments requiring training/test/validation splits.
Hardware Specification	No	NeurIPS Paper Checklist: 8. Experiments compute resources. Answer: [No]. Justification: It is irrelevant to the paper's results.
Software Dependencies	No	The MIP of Algorithm 2 is solved using Gurobi.
Experiment Setup	No	The paper primarily presents theoretical characterizations and an algorithm. The 'Application: Hiring Interviews' section (Section 6) describes problem parameters like 'η controls the misspecification level' and how candidate features are generated (uniform distribution for GPA and experience levels), but does not detail experimental setup in terms of hyperparameters, training procedures, or specific solver settings beyond mentioning 'Gurobi' for the MILP.
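
The loop quoted in the Pseudocode row can be illustrated on a toy instance. The following is only a sketch under strong assumptions that are not taken from the paper: the decision set X and the uncertainty set C are both given as small finite lists of integer vectors, the "there exists c, x" search is replaced by brute-force enumeration, and span membership is tested with exact rational elimination. The function names (`argmin_set`, `reduce_against`, `direction_basis`) are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def argmin_set(X, c):
    """All points of the finite list X that minimize the linear cost c^T x."""
    values = [dot(c, x) for x in X]
    best = min(values)
    return [x for x, val in zip(X, values) if val == best]


def reduce_against(v, echelon):
    """Eliminate v against stored (pivot, row) pairs.

    A nonzero remainder means v lies outside the span of the stored rows.
    """
    v = list(v)
    for pivot, row in echelon:
        if v[pivot] != 0:
            factor = v[pivot] / row[pivot]
            v = [a - factor * b for a, b in zip(v, row)]
    return v


def direction_basis(X, C):
    """Brute-force version of the quoted loop (assumes finite X and C)."""
    x0 = argmin_set(X, C[0])[0]          # reference optimal point for some c0 in C
    D, echelon = [], []                  # D: collected directions; echelon: reduced copies
    for c in C:
        for x in argmin_set(X, c):
            d = [Fraction(a) - Fraction(b) for a, b in zip(x, x0)]
            r = reduce_against(d, echelon)
            if any(r):                   # x - x0 is not in span(D): add it
                pivot = next(i for i, a in enumerate(r) if a != 0)
                echelon.append((pivot, r))
                D.append(d)
    return D


# Toy instance (hypothetical): vertices of the unit square, three cost vectors.
X = [(0, 0), (1, 0), (0, 1), (1, 1)]
C = [(1, 1), (1, 0), (0, 1)]
D = direction_basis(X, C)
```

On this toy instance the sketch collects two independent directions. Because span D only grows, a single pass over all (c, x) pairs is equivalent to the while loop when both sets are finite. Continuous sets would need a different search procedure.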
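
The Open Datasets row describes the synthetic candidate features only in words: GPA uniform on [2, 4] and experience uniform on {1, 2, 3, 4, 5}. A minimal sketch of how such data could be regenerated reproducibly is shown below. The function name, the dictionary keys, the number of candidates, and the seed are all assumptions made for illustration and do not come from the paper.

```python
import random


def generate_candidates(n, seed=0):
    """Draw n synthetic candidates with a fixed seed (hypothetical helper)."""
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    return [
        {
            "gpa": rng.uniform(2.0, 4.0),      # GPA uniform on [2, 4]
            "experience": rng.randint(1, 5),   # experience uniform on {1, ..., 5}
        }
        for _ in range(n)
    ]


candidates = generate_candidates(20, seed=0)
```

Fixing the seed makes the draw repeatable across runs of the same Python version, which is the minimum needed for someone else to regenerate the same illustrative instance.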