Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Procrastinating with Confidence: Near-Optimal, Anytime, Adaptive Algorithm Configuration
Authors: Robert Kleinberg, Kevin Leyton-Brown, Brendan Lucier, Devon Graham
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically both that such settings arise frequently in practice and that the anytime property is useful for finding good configurations quickly. [...] 5 Experimental Results: We experiment with SPC on the benchmark set of runtimes generated by Weisz et al. (2018b) for testing LEAPSANDBOUNDS. |
| Researcher Affiliation | Collaboration | Robert Kleinberg, Department of Computer Science, Cornell University; Kevin Leyton-Brown, Department of Computer Science, University of British Columbia; Brendan Lucier, Microsoft Research; Devon Graham, Department of Computer Science, University of British Columbia |
| Pseudocode | Yes | Algorithm 1: Structured Procrastination w/ Confidence |
| Open Source Code | Yes | Code to reproduce experiments is available at https://github.com/drgrhm/alg_config |
| Open Datasets | Yes | We experiment with SPC on the benchmark set of runtimes generated by Weisz et al. (2018b) for testing LEAPSANDBOUNDS. This data consists of pre-computed runtimes for 972 configurations of the minisat (Sorensson & Een, 2005) SAT solver on 20118 SAT instances generated using CNFuzzDD (http://fmv.jku.at/cnfuzzdd/). |
| Dataset Splits | No | The paper uses a benchmark set of pre-computed runtimes but does not specify any explicit training, validation, or test dataset splits. |
| Hardware Specification | No | The paper mentions 'CPU time in days' for experimental runtime but does not provide specific hardware details such as CPU/GPU models, memory, or other system specifications. |
| Software Dependencies | No | The paper mentions the 'minisat' SAT solver used to generate the dataset but does not list specific software dependencies with version numbers required to replicate the experiments. |
| Experiment Setup | No | The paper describes the benchmark data and comparisons made, but does not provide specific hyperparameters or system-level training settings for SPC within its experimental setup. |