Neural Networks for Predicting Algorithm Runtime Distributions

Authors: Katharina Eggensperger, Marius Lindauer, Frank Hutter

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In an empirical study involving five algorithms for SAT solving and AI planning, we show that neural networks predict the true RTDs of unseen instances better than previous methods, and can even do so when only few runtime observations are available per training instance.
Researcher Affiliation | Academia | Katharina Eggensperger, Marius Lindauer, Frank Hutter; University of Freiburg; {eggenspk, lindauer, fh}@cs.uni-freiburg.de
Pseudocode | No | The paper describes a pipeline in Figure 1 but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code and data can be obtained from here: http://www.ml4aad.org/distnet
Open Datasets | Yes | Code and data can be obtained from here: http://www.ml4aad.org/distnet. Table 2 (characteristics of the used data sets) lists, per scenario, the number of instances, number of features, and cutoff in seconds: Clasp-factoring (2000, 102, 5000), Saps-CV-VAR (10011, 46, 60), Spear-QCP (8076, 91, 5000), YalSAT-QCP (11747, 91, 5000), Spear-SWGCP (11182, 76, 5000), YalSAT-SWGCP (11182, 76, 5000), LPG-Zenotravel (3999, 165, 300).
Dataset Splits | No | "Table 4 shows the NLLH achieved using a 10-fold cross-validation, i.e., we split the instances into 10 disjoint sets, train our models on all but one subset and measure the test performance on the left-out subset." The paper specifies 10-fold cross-validation for training and testing, but does not explicitly mention a separate validation set split (a cross-validation sketch follows the table below).
Hardware Specification | Yes | Run on a compute cluster with nodes equipped with two Intel Xeon E5-2630v4 CPUs and 128 GB memory running CentOS 7, and on a compute cluster with nodes equipped with two Intel Xeon E5-2650v2 CPUs and 64 GB memory running Ubuntu 14.04.
Software Dependencies | No | "We used the open-source neural network library Keras [Chollet et al., 2015] for our NN, scikit-learn [Pedregosa et al., 2011] for the RF implementation, and scipy [Jones et al., 2001] for fitting the distributions." The paper lists software libraries but does not provide specific version numbers for them (a distribution-fitting sketch follows the table below).
Experiment Setup | Yes | "Therefore we chose a fairly small NN with 2 hidden layers each with 16 neurons", "used a fairly small batch size of 16 to reduce the correlation of the training data points in each batch", and "used a fairly small initial learning rate of 1e-3 exponentially decaying to 1e-5 and used gradient clipping [Pascanu et al., 2014] on top of it. Besides that, we used common architectural and parameterization choices: tanh as an activation function, SGD for training, batch normalization, and an L2-regularization of 1e-4." (A model-configuration sketch follows the table below.)
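
The Dataset Splits entry describes a 10-fold instance-level cross-validation. A minimal sketch of such a split, using scikit-learn's KFold with hypothetical array names and dummy data rather than the paper's released scripts, could look as follows:

    import numpy as np
    from sklearn.model_selection import KFold

    rng = np.random.RandomState(0)
    instance_features = rng.rand(100, 46)       # hypothetical: 100 instances, 46 features
    runtimes = rng.exponential(size=(100, 10))  # hypothetical: 10 runtime observations each

    # Split the *instances* into 10 disjoint subsets, as quoted above.
    kf = KFold(n_splits=10, shuffle=True, random_state=0)
    for fold, (train_idx, test_idx) in enumerate(kf.split(instance_features)):
        X_train, X_test = instance_features[train_idx], instance_features[test_idx]
        y_train, y_test = runtimes[train_idx], runtimes[test_idx]
        # train the RTD model on nine subsets, evaluate NLLH on the held-out subset
        print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test instances")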
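
The Software Dependencies entry notes that scipy was used for fitting the distributions. Below is a minimal sketch of one such fit, assuming a lognormal as the candidate RTD family; this is an illustrative choice on dummy data, not the paper's exact fitting code:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    observed_runtimes = rng.lognormal(mean=1.0, sigma=0.5, size=100)  # dummy runtimes

    # Maximum-likelihood fit of a lognormal; floc=0 pins the location at zero runtime.
    shape, loc, scale = stats.lognorm.fit(observed_runtimes, floc=0)

    # Negative log-likelihood (NLLH) of the fitted distribution on the observations.
    nllh = -np.sum(stats.lognorm.logpdf(observed_runtimes, shape, loc=loc, scale=scale))
    print(f"fitted sigma={shape:.3f}, scale={scale:.3f}, NLLH={nllh:.2f}")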
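
The Experiment Setup entry quotes the full network and optimizer configuration. The sketch below assembles those choices against the current tf.keras API (the paper used an earlier Keras release); the output head, loss, input dimensionality, and exact decay schedule are placeholders, since the paper trains the network to output RTD parameters under a likelihood-based loss that is not reproduced here:

    import tensorflow as tf
    from tensorflow.keras import layers, regularizers

    n_features = 46  # hypothetical number of instance features

    # Two hidden layers of 16 neurons, tanh activations, batch normalization,
    # and L2 regularization of 1e-4, as quoted above.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        layers.Dense(16, activation="tanh", kernel_regularizer=regularizers.l2(1e-4)),
        layers.BatchNormalization(),
        layers.Dense(16, activation="tanh", kernel_regularizer=regularizers.l2(1e-4)),
        layers.BatchNormalization(),
        layers.Dense(2, activation="softplus"),  # placeholder head, e.g. two RTD parameters
    ])

    # Initial learning rate 1e-3 decayed exponentially (towards 1e-5); decay_steps and
    # decay_rate are assumptions. clipnorm adds gradient clipping on top of SGD.
    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=1e-3, decay_steps=1000, decay_rate=0.96)
    optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule, clipnorm=1.0)

    model.compile(optimizer=optimizer, loss="mse")  # placeholder loss, not the paper's NLLH
    # model.fit(X_train, y_train, batch_size=16, epochs=...)  # batch size 16, as quoted above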