Formalizing Preferences Over Runtime Distributions
Authors: Devon R. Graham, Kevin Leyton-Brown, Tim Roughgarden
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "This paper aims to lay theoretical foundations for such choices by formalizing preferences over runtime distributions. ... Finally, in Section 5 we present some real-world examples where the choice of utility function really is important and changes our conclusions about which algorithm is considered best." Later, in Section 5: "Algorithm Configuration. We considered a dataset due to Weisz et al. (2018) which evaluated 972 randomly-sampled configurations of the minisat (Sorensson & Een, 2005) SAT solver... Our results (Figure 3) show that these differences were significant in practice: we often lost a substantial fraction of the available utility when we optimized for the wrong utility function. International SAT Competition. Figure 4 shows the ranking of the Parallel Track of the 2021 International SAT Competition." |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of British Columbia, Vancouver, BC; (2) Department of Computer Science, Columbia University, New York, New York; (3) a16z crypto. Correspondence to: Devon R. Graham <drgraham@cs.ubc.ca>. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code to reproduce all figures can be found at https://github.com/drgrhm/formalizing-preferences |
| Open Datasets | Yes | We considered a dataset due to Weisz et al. (2018) which evaluated 972 randomly-sampled configurations of the minisat (Sorensson & Een, 2005) SAT solver on 20118 instances generated by CNFuzz DD. |
| Dataset Splits | No | The paper mentions evaluating configurations on '20118 instances generated by CNFuzz DD' but does not specify any training, validation, or test splits for these instances. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | The paper mentions the 'minisat SAT solver' and 'CNFuzz DD' but does not provide specific version numbers for these or any other software dependencies used in the experiments. |
| Experiment Setup | No | The paper mentions evaluating 'randomly-sampled configurations' and analyzing results from the SAT Competition, but it does not provide specific experimental setup details such as hyperparameter values, training configurations, or system-level settings used for its own analysis. |