Explaining Hyperparameter Optimization via Partial Dependence Plots
Authors: Julia Moosbauer, Julia Herbinger, Giuseppe Casalicchio, Marius Lindauer, Bernd Bischl
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In an experimental study, we provide quantitative evidence for the increased quality of the PDPs within sub-regions. |
| Researcher Affiliation | Academia | Department of Statistics, Ludwig-Maximilians-University Munich, Munich, Germany Institute of Information Processing, Leibniz University Hannover, Hannover, Germany |
| Pseudocode | Yes | The pseudo-code to partition a hyperparameter (sub-)space and corresponding sample (λ_C^(i))_{i∈N}, N ⊆ {1, ..., n}, into two child regions is shown in Algorithm 1. |
| Open Source Code | Yes | The implementation of the proposed methods as well as reproducible scripts for the experimental analysis are provided in a public git-repository3. https://github.com/slds-lmu/paper_2021_xautoml |
| Open Datasets | Yes | LCBench data [Zimmer et al., 2021]. For each of the 35 different OpenML [Vanschoren et al., 2013] classification tasks, LCBench provides access to evaluations of a deep neural network on 2000 configurations randomly drawn from the configuration space defined by Auto-PyTorch Tabular (see Table 5 in Appendix C.2). |
| Dataset Splits | No | The paper states 'For each task, we trained a random forest as an empirical performance model that predicts the balanced validation error of the neural network for a given configuration' but does not provide specific details on the dataset split (e.g., percentages, sample counts, or methodology for creating the validation set) for their experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU, memory, or specific computing cluster types) used to run the experiments. |
| Software Dependencies | No | The paper mentions several software components, such as 'Scikit-learn', 'mlrMBO', 'pdp', 'Auto-PyTorch Tabular', and 'Python', but it does not specify their version numbers, which would be needed to reproduce the software environment. |
| Experiment Setup | Yes | PDPs are computed with respect to single features for G = 20 equidistant grid points and n = 1000 Monte Carlo samples. We ran BO with a GP surrogate model with a Matérn-3/2 kernel and the LCB acquisition function a(λ) = m̂(λ) + τ·ŝ(λ) with different values τ ∈ {0.1, 1, 5} to control the sampling bias. All computations were repeated 30 times. Each BO run was allotted a budget of 200 objective function evaluations. |
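The pseudocode row above refers to Algorithm 1, which partitions a hyperparameter (sub-)space and its associated sample into two child regions. The paper's actual split criterion (how the splitting hyperparameter and threshold are chosen) is not reproduced in this report, so the sketch below only illustrates the partitioning step itself; `split_region` and all variable names are illustrative, not taken from the authors' code.

```python
import numpy as np

def split_region(configs, N, j, t):
    """Partition the index set N into two child regions:
    {i : configs[i, j] <= t} and {i : configs[i, j] > t}.

    configs: (n, d) array of sampled hyperparameter configurations
    N:       indices of the sample belonging to the current region
    j:       index of the hyperparameter to split on
    t:       split threshold (choice of j and t is not shown here)
    """
    N = np.asarray(N)
    mask = configs[N, j] <= t
    return N[mask], N[~mask]

rng = np.random.default_rng(0)
configs = rng.uniform(size=(10, 3))  # 10 configurations, 3 hyperparameters
left, right = split_region(configs, np.arange(10), j=0, t=0.5)
```

Recursing on `left` and `right` with further splits yields the tree of sub-regions within which the paper reports improved PDP quality.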
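The experiment-setup row describes Monte Carlo PDP estimation over G = 20 equidistant grid points, with a random forest serving as the empirical performance model. A minimal sketch of that estimator follows, assuming a toy 3-dimensional configuration space and a synthetic objective; `partial_dependence_1d` and the data-generating function are illustrative, not from the paper's repository.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def partial_dependence_1d(model, X_sample, feature, n_grid=20):
    """Monte Carlo PDP estimate: sweep one feature over n_grid equidistant
    grid points, averaging model predictions over the sample each time."""
    grid = np.linspace(X_sample[:, feature].min(),
                       X_sample[:, feature].max(), n_grid)
    pdp = np.empty(n_grid)
    for g, val in enumerate(grid):
        X_mod = X_sample.copy()
        X_mod[:, feature] = val               # fix the feature of interest
        pdp[g] = model.predict(X_mod).mean()  # marginalize over the rest
    return grid, pdp

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 3))                # toy configuration sample
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# A random forest stands in for the empirical performance model.
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
grid, pdp = partial_dependence_1d(model, X, feature=0, n_grid=20)
```

In the paper's setting, `X_sample` would be the (BO-biased) sample of evaluated configurations, which is exactly why PDP quality within sub-regions is of interest.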