On Performance Estimation in Automatic Algorithm Configuration
Authors: Shengcai Liu, Ke Tang, Yunwei Lei, Xin Yao
AAAI 2020, pp. 2384–2391
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper addresses this gap by studying the performance estimation problem in AAC. More specifically, this paper first proves the universal best performance estimator in a practical setting, and then establishes theoretical bounds on the estimation error, i.e., the difference between the training performance and the true performance for a parameter configuration, considering finite and infinite configuration spaces respectively. These findings were verified in extensive experiments conducted on four algorithm configuration scenarios involving different problem domains. |
| Researcher Affiliation | Academia | ¹School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China; ²Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China |
| Pseudocode | No | The paper describes algorithms and estimation methods but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and the complete experiment results are available at https://github.com/EEAAC/ac_estimation_error |
| Open Datasets | Yes | We selected two scenarios, SATenstein-QCP and clasp-weighted-sequence, from the Algorithm Configuration Library (AClib) (Hutter et al. 2014), and built two new scenarios, LKH-uniform-400/1000. For each scenario, we gathered an M × P × 5 matrix containing the performances of M configurations on P instances, with each configuration run on each instance 5 times. Let Θ_M be the set of the M configurations and Z_P be the set of the P instances. In the experiments, when acquiring the performance of a configuration θ on an instance, instead of actually running θ, the value stored in the corresponding entry of the matrix was used. |
| Dataset Splits | No | For convenience, henceforth we will use the split P_1|P_2 to denote that we randomly select, without replacement, P_1 training instances and P_2 test instances from Z_P. For a given θ, we always used the performance obtained by an estimator on the training instances as its training performance, and its performance on the test instances as its true performance. |
| Hardware Specification | Yes | All the experiments were conducted on Xeon machines with 128 GB RAM and 24 cores each (2.20 GHz, 30 MB cache), running CentOS. |
| Software Dependencies | No | The paper mentions that the code was implemented based on AClib (Hutter et al. 2014), but it does not specify version numbers for AClib or any other software dependencies. |
| Experiment Setup | Yes | We set K = r_1 · P and N = r_2 · K, ranging r_1 from 0.1 to 0.5 in steps of 0.05 and r_2 from 0.25 to 4.0 in steps of 0.25. To reduce the variance of our experiments, for each combination of r_1 and r_2 we performed the K|P/2 split 2500 times; on each split, we obtained the estimation error of an estimator on each θ ∈ Θ_M, then calculated the mean value, which was further averaged over all splits. |
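
The Open Datasets and Dataset Splits rows describe a lookup-based protocol: performances are pre-gathered in an M × P × 5 matrix, so evaluating a configuration on an instance becomes a table lookup rather than an actual algorithm run. Below is a minimal Python sketch of that protocol; the array `perf_matrix`, the sizes `M`, `P`, `RUNS`, and the helper names are illustrative assumptions, not taken from the paper's released code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; the paper gathers an M x P x 5 matrix per scenario.
M, P, RUNS = 100, 500, 5
perf_matrix = rng.random((M, P, RUNS))  # placeholder for the pre-gathered performances

def performance(theta: int, instance: int, run: int) -> float:
    """Return the stored performance of configuration `theta` on `instance`
    for the given run index, instead of actually running `theta`."""
    return perf_matrix[theta, instance, run]

def split(p1: int, p2: int) -> tuple[np.ndarray, np.ndarray]:
    """P_1|P_2 split: draw p1 training and p2 test instances from Z_P,
    without replacement."""
    assert p1 + p2 <= P
    idx = rng.choice(P, size=p1 + p2, replace=False)
    return idx[:p1], idx[p1:]

train_idx, test_idx = split(50, 250)
```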
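
The Experiment Setup row sweeps a grid over r_1 and r_2 and averages estimation errors over 2500 splits. The following is a hedged sketch of that loop under stated assumptions: the excerpt does not say how the run budget N = r_2 · K is allocated across training instances by each estimator, so the sketch simply averages all stored runs per instance, and all sizes and variable names are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
M, P, RUNS = 100, 500, 5                 # illustrative sizes
perf = rng.random((M, P, RUNS))          # stands in for the gathered M x P x 5 matrix

results = {}
for r1 in np.arange(0.10, 0.50 + 1e-9, 0.05):
    for r2 in np.arange(0.25, 4.00 + 1e-9, 0.25):
        K = int(r1 * P)                  # number of training instances
        N = int(r2 * K)                  # run budget; its allocation is estimator-specific
        errs = []
        for _ in range(2500):            # 2500 independent K | P/2 splits
            idx = rng.choice(P, size=K + P // 2, replace=False)
            tr, te = idx[:K], idx[K:]
            # Training vs. true performance per configuration: mean over
            # instances and runs (a stand-in for the paper's estimators).
            train_perf = perf[:, tr].mean(axis=(1, 2))
            true_perf = perf[:, te].mean(axis=(1, 2))
            errs.append(np.abs(train_perf - true_perf).mean())
        results[(round(r1, 2), round(r2, 2))] = float(np.mean(errs))
```

For a quick sanity check, reduce the 2500 splits to a few dozen; the structure of the sweep is unchanged.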