Formalizing Preferences Over Runtime Distributions

Authors: Devon R. Graham, Kevin Leyton-Brown, Tim Roughgarden

ICML 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "This paper aims to lay theoretical foundations for such choices by formalizing preferences over runtime distributions. ... Finally, in Section 5 we present some real-world examples where the choice of utility function really is important and changes our conclusions about which algorithm is considered best." And later, in Section 5: "Algorithm Configuration. We considered a dataset due to Weisz et al. (2018) which evaluated 972 randomly-sampled configurations of the minisat (Sorensson & Een, 2005) SAT solver... Our results (Figure 3) show that these differences were significant in practice: we often lost a substantial fraction of the available utility when we optimized for the wrong utility function. International SAT Competition. Figure 4 shows the ranking of the Parallel Track of the 2021 International SAT Competition." |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of British Columbia, Vancouver, BC; (2) Department of Computer Science, Columbia University, New York, New York; (3) a16z crypto. Correspondence to: Devon R. Graham <drgraham@cs.ubc.ca>. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code to reproduce all figures can be found at https://github.com/drgrhm/formalizing-preferences |
| Open Datasets | Yes | We considered a dataset due to Weisz et al. (2018) which evaluated 972 randomly-sampled configurations of the minisat (Sorensson & Een, 2005) SAT solver on 20118 instances generated by CNFuzz DD. |
| Dataset Splits | No | The paper mentions evaluating configurations on '20118 instances generated by CNFuzz DD' but does not specify any training, validation, or test splits for these instances. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | The paper mentions the 'minisat SAT solver' and 'CNFuzz DD' but does not provide specific version numbers for these or any other software dependencies used in the experiments. |
| Experiment Setup | No | The paper mentions evaluating 'randomly-sampled configurations' and analyzing results from the SAT Competition, but it does not provide specific experimental setup details such as hyperparameter values, training configurations, or system-level settings used for its own analysis. |