Bias in Algorithm Portfolio Performance Evaluation

Authors: Chris Cameron, Holger H. Hoos, Kevin Leyton-Brown

Venue: IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We report results from an empirical study using solvers and instances submitted to several SAT competitions, in which we observe significant bias on many random instances and some combinatorial instances.
Researcher Affiliation | Academia | Chris Cameron, Holger H. Hoos, Kevin Leyton-Brown; University of British Columbia, 201-2366 Main Mall, Vancouver, BC, CANADA
Pseudocode | No | The paper describes mathematical derivations and experimental procedures in text but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any information or links regarding the availability of its source code.
Open Datasets | Yes | We began by gathering all runnable solvers submitted to SAT competitions from 2007 to 2014 in the Hard-combinatorial SAT+UNSAT, Application SAT+UNSAT, and Random SAT+UNSAT tracks. [...] We ran the 21 randomized solvers from Random SAT+UNSAT on 50 randomly sampled instances from the Random SAT+UNSAT track and the 12 randomized solvers from the Hard-combinatorial SAT+UNSAT on all 44 sgen instances from the Hard-combinatorial SAT+UNSAT track.
Dataset Splits | No | The paper discusses bootstrap sampling and selection of instances, but it does not specify explicit training, validation, and test dataset splits in the conventional sense used for model evaluation (a generic bootstrap resampling sketch appears after this table).
Hardware Specification | Yes | All runs were performed on the 544-node Westgrid cluster Orcinus (each of whose nodes is equipped with two 6-core, 2.66 GHz Intel Xeon X5650 CPUs and 24GB memory, running Red Hat Enterprise Linux Server 5.3).
Software Dependencies | No | The paper mentions 'Red Hat Enterprise Linux Server 5.3' as the operating system but does not provide specific version numbers for any other software dependencies, libraries, or tools used in the experiments.
Experiment Setup | Yes | Every solver was given a cutoff time of 5000 CPU seconds to solve each instance; [...] We ran each solver 50 times on every instance with distinct pseudorandom number seeds and with a per-run cutoff time of 5000 CPU seconds. [...] We elected to allocate different amounts of memory for different solvers, depending on their memory requirements. We profiled each solver's memory usage over all instances and allocated to each solver its maximum RAM usage plus a small buffer: between 1GB and 3GB in total, depending on the solver.
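
The Open Datasets and Experiment Setup rows above describe the run protocol: instances sampled at random, 50 runs per solver-instance pair with distinct pseudorandom seeds, and a per-run cutoff of 5000 CPU seconds. The following is a minimal sketch of such a run matrix, not the authors' harness; the solver paths, instance paths, the "-seed" flag, and the exit-code convention are assumptions, and the sketch enforces a wall-clock timeout with no memory limit, unlike the CPU-time cutoffs and per-solver memory caps described in the paper.

```python
import random
import subprocess

# Illustrative sketch only. Solver and instance paths and the "-seed" flag are
# hypothetical placeholders; real SAT competition solvers differ in their interfaces.
SOLVERS = ["./solvers/solver_a", "./solvers/solver_b"]
INSTANCES = ["./instances/random_{:03d}.cnf".format(i) for i in range(200)]

CUTOFF_SECONDS = 5000      # per-run cutoff used in the paper (CPU seconds)
RUNS_PER_PAIR = 50         # runs per (solver, instance) pair, each with a distinct seed

# Sample 50 instances uniformly at random, mirroring the Random SAT+UNSAT setup.
rng = random.Random(0)
sampled = rng.sample(INSTANCES, 50)

results = []
for solver in SOLVERS:
    for instance in sampled:
        for seed in range(RUNS_PER_PAIR):
            cmd = [solver, "-seed", str(seed), instance]
            try:
                # Note: this enforces a wall-clock timeout and no memory limit;
                # the paper used CPU-time cutoffs and per-solver memory limits
                # enforced on the cluster, which this sketch does not replicate.
                proc = subprocess.run(cmd, capture_output=True, timeout=CUTOFF_SECONDS)
                # Exit codes 10 (SAT) and 20 (UNSAT) follow the common SAT
                # competition convention; treat anything else as unknown.
                status = "solved" if proc.returncode in (10, 20) else "unknown"
            except subprocess.TimeoutExpired:
                status = "timeout"
            results.append((solver, instance, seed, status))
```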
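
The Dataset Splits row notes that the paper relies on bootstrap sampling over instances rather than conventional train/validation/test splits. Below is a generic percentile-bootstrap sketch over per-instance scores; the synthetic data, function name, and parameters are illustrative assumptions and do not reproduce the paper's analysis.

```python
import random
import statistics

# Hypothetical per-instance scores (e.g., penalized runtimes) keyed by instance name;
# these values are synthetic and purely illustrative.
scores = {"inst_{:02d}".format(i): random.uniform(1.0, 5000.0) for i in range(50)}

def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of per-instance scores."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        # Resample instances with replacement and record the resampled mean.
        resample = [rng.choice(values) for _ in values]
        means.append(statistics.mean(resample))
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.mean(values), (lo, hi)

point, (low, high) = bootstrap_mean_ci(list(scores.values()))
print(f"mean score {point:.1f}, 95% bootstrap CI [{low:.1f}, {high:.1f}]")
```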