Explaining Hyperparameter Optimization via Partial Dependence Plots

Authors: Julia Moosbauer, Julia Herbinger, Giuseppe Casalicchio, Marius Lindauer, Bernd Bischl

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In an experimental study, we provide quantitative evidence for the increased quality of the PDPs within sub-regions."
Researcher Affiliation | Academia | Department of Statistics, Ludwig-Maximilians-University Munich, Munich, Germany; Institute of Information Processing, Leibniz University Hannover, Hannover, Germany
Pseudocode | Yes | "The pseudo-code to partition a hyperparameter (sub-)space and corresponding sample (λ_C^(i))_{i∈N} ∈ C, N ⊆ {1, ..., n}, into two child regions is shown in Algorithm 1."
Open Source Code | Yes | "The implementation of the proposed methods as well as reproducible scripts for the experimental analysis are provided in a public git repository." https://github.com/slds-lmu/paper_2021_xautoml
Open Datasets | Yes | "LCBench data [Zimmer et al., 2021]. For each of the 35 different OpenML [Vanschoren et al., 2013] classification tasks, LCBench provides access to evaluations of a deep neural network on 2000 configurations randomly drawn from the configuration space defined by Auto-PyTorch Tabular (see Table 5 in Appendix C.2)."
Dataset Splits | No | The paper states "For each task, we trained a random forest as an empirical performance model that predicts the balanced validation error of the neural network for a given configuration" but does not provide specific details on the dataset split (e.g., percentages, sample counts, or the methodology for creating the validation set) used in the experiments.
Hardware Specification | No | The paper does not specify the hardware (e.g., CPU, GPU, memory, or computing cluster) used to run the experiments.
Software Dependencies | No | The paper mentions several software components, such as scikit-learn, mlrMBO, pdp, Auto-PyTorch Tabular, and Python, but does not specify their version numbers, which are necessary for reproducible software dependencies.
Experiment Setup | Yes | "PDPs are computed with regards to single features for G = 20 equidistant grid points and n = 1000 Monte Carlo samples. We ran BO with a GP surrogate model with a Matérn-3/2 kernel and the LCB acquisition function a(λ) = m̂(λ) + τ·ŝ(λ) with different values τ ∈ {0.1, 1, 5} to control the sampling bias. All computations were repeated 30 times. Each BO run was allotted a budget of 200 objective function evaluations."
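The partitioning step quoted in the "Pseudocode" row can be illustrated with a minimal sketch: one binary split of a sampled region into two child regions via an axis-aligned cut. The summed within-child variance used as the split criterion below is an illustrative stand-in for the paper's uncertainty-based objective in Algorithm 1, and `split_region` with all its parameters is a hypothetical name, not the authors' implementation.

```python
import numpy as np

def split_region(X, scores, n_candidates=10):
    """Split samples into two child regions by the axis-aligned cut that
    minimizes the summed within-child variance of `scores` (a stand-in
    impurity criterion, not the paper's actual split objective)."""
    best = None
    for j in range(X.shape[1]):
        # Candidate split values: quantiles of feature j over the sample.
        for t in np.quantile(X[:, j], np.linspace(0.1, 0.9, n_candidates)):
            left = X[:, j] <= t
            right = ~left
            if left.sum() < 2 or right.sum() < 2:
                continue  # skip degenerate splits
            cost = scores[left].var() * left.sum() + scores[right].var() * right.sum()
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    _, feature, threshold, left, right = best
    return feature, threshold, left, right

# Toy usage: scores depend only on feature 0, so the split should find it.
rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 2))
scores = np.where(X[:, 0] <= 0.5, 0.0, 1.0) + 0.01 * rng.standard_normal(200)
feature, threshold, left, right = split_region(X, scores)
```

In the full method this split would be applied recursively, producing nested sub-regions in which the PDP is then re-estimated.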
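The PDP computation described in the "Experiment Setup" row (G equidistant grid points, n Monte Carlo samples) follows the standard Monte Carlo estimator and can be sketched as below. The toy surrogate `predict` and the function name are illustrative assumptions, not the authors' code.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Monte Carlo PDP estimate: for each grid value g, set `feature`
    to g in every sample and average the model's predictions."""
    pdp = []
    for g in grid:
        X_mod = X.copy()
        X_mod[:, feature] = g  # force the feature of interest to the grid value
        pdp.append(predict(X_mod).mean())
    return np.array(pdp)

# Illustrative usage mirroring the reported setup: G = 20, n = 1000.
rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 3))                    # n = 1000 Monte Carlo samples
grid = np.linspace(0.0, 1.0, 20)                   # G = 20 equidistant grid points
predict = lambda X: X[:, 0] ** 2 + 0.1 * X[:, 1]   # toy surrogate model
pd_vals = partial_dependence(predict, X, feature=0, grid=grid)
```

In the paper's setting, `predict` would be the BO surrogate's mean prediction over hyperparameter configurations rather than this toy function.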