Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Accelerating ERM for data-driven algorithm design using output-sensitive techniques
Authors: Maria-Florina F. Balcan, Christopher Seiler, Dravyansh Sharma
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We design novel approaches that use tools from computational geometry and lead to output-sensitive algorithms for learning good parameters by implementing the ERM (Empirical Risk Minimization) for several distinct data-driven design problems. The resulting learning algorithms scale polynomially with the number of sum dual class function pieces R_Σ in the worst case (See Table 1) and are efficient for small constant d. |
| Researcher Affiliation | Academia | Carnegie Mellon University, EMAIL. Work done by Christopher Seiler while he was at CMU. Corresponding author: EMAIL. Work done by Dravyansh Sharma while he was at CMU. |
| Pseudocode | Yes | Algorithm 1: OUTPUTSENSITIVEPARTITIONSEARCH |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing its source code or a direct link to a code repository. |
| Open Datasets | No | The paper discusses 'problem instances' and 'problem samples' but does not provide concrete access information (link, DOI, repository, or formal citation) for a specific publicly available or open dataset used in its analysis. |
| Dataset Splits | No | The paper does not specify exact train/validation/test split percentages, absolute sample counts, or reference predefined splits for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running experiments. |
| Software Dependencies | No | The paper does not list specific software components with version numbers (e.g., Python 3.8, PyTorch 1.9) or self-contained solvers (e.g., CPLEX 12.4) needed to replicate the work. |
| Experiment Setup | No | The paper does not provide specific experimental setup details such as concrete hyperparameter values, training configurations, or system-level settings. |