HyperJump: Accelerating HyperBand via Risk Modelling
Authors: Pedro Mendes, Maria Casimiro, Paolo Romano, David Garlan
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate HyperJump on a suite of hyper-parameter optimization problems and show that it provides over one order of magnitude speed-ups, both in sequential and parallel deployments, on a variety of deep-learning, kernel-based learning and neural architecture search problems when compared to HyperBand and to several state-of-the-art optimizers. |
| Researcher Affiliation | Academia | Pedro Mendes (1,2), Maria Casimiro (1,2), Paolo Romano (1), David Garlan (2). (1) INESC-ID and Instituto Superior Técnico, Universidade de Lisboa; (2) Software and Societal Systems Department, Carnegie Mellon University. {pgmendes, mdaloura, dg4d}@andrew.cmu.edu, romano@inesc-id.pt |
| Pseudocode | Yes | Algorithm 1: Pseudo-code for a HJ bracket consisting of S stages, with budget b for the initial stage. [...] Algorithm 2: Pseudo-code of the logic used to determine the sets of configurations to consider when jumping from stage s to stage s + 1 (function GETCANDIDATESFORS()) [...] Algorithm 3: Pseudo-code for the EVALUATEJUMPRISK function. (A hedged sketch of the jump-risk check appears after the table.) |
| Open Source Code | Yes | We have made available the implementation of HJ and the benchmarks used: https://github.com/pedrogbmendes/HyperJump |
| Open Datasets | Yes | Our first benchmark, NATS-Bench (Dong et al. 2021)... We also use LIBSVM (Chang and Lin 2011) on the Covertype data set (Dua and Graff 2017)... |
| Dataset Splits | No | The paper mentions using NATS-Bench and LIBSVM datasets but does not provide specific details on the training/validation/test splits (e.g., percentages, sample counts, or methodology for splitting). |
| Hardware Specification | No | The paper discusses parallel deployments using '32 workers' but does not specify the type of hardware (e.g., GPU/CPU models, memory) used for the experiments. |
| Software Dependencies | No | The paper mentions software frameworks like Ray Tune and implementations of BOHB and HB, but it does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | We use the default parameters of BOHB and Fabolas. Similarly to HB, we set the parameter η to 3, and for fairness, when comparing HJ, HB, BOHB, and ASHA, we configure them to use the same η value. We use the default value of 10% for the threshold λ for HJ and include in the supplemental material a study on the sensitivity to the tuning of λ. (An illustrative η-schedule sketch appears after the table.) |
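To make the reported setup concrete, the snippet below illustrates how the halving rate η = 3 shapes a HyperBand-style bracket: at each stage only the top 1/η of configurations survive, and each survivor's budget grows by a factor of η. This is a generic sketch of the successive-halving schedule, not code from the HyperJump repository; the function name and the 27-configuration example are illustrative.

```python
def bracket_schedule(n_configs, min_budget, eta=3, stages=4):
    """Sketch of a HyperBand-style bracket: at each stage, keep the top
    1/eta of configurations and grant each survivor eta times the budget."""
    schedule = []
    n, b = n_configs, min_budget
    for s in range(stages):
        schedule.append((s, n, b))
        n = max(1, n // eta)  # survivors advancing to the next stage
        b = b * eta           # per-configuration budget at the next stage
    return schedule

for stage, n, b in bracket_schedule(n_configs=27, min_budget=1, eta=3):
    print(f"stage {stage}: {n} config(s), budget {b} each")
# stage 0: 27 config(s), budget 1 each
# stage 1: 9 config(s), budget 3 each
# stage 2: 3 config(s), budget 9 each
# stage 3: 1 config(s), budget 27 each
```

The same η governs HJ, HB, BOHB, and ASHA in the paper's comparisons, which is why the authors stress configuring all of them with the same value.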
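The jump-risk check (Algorithm 3) gates HJ's decision to skip low-budget evaluations on an estimate of how much quality could be lost by promoting configurations based on model predictions alone. The sketch below conveys the idea with a Monte Carlo regret estimate over Gaussian predictive distributions, compared against the 10% threshold λ; the function names, the Gaussian assumption, and the regret-versus-λ comparison are our illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def evaluate_jump_risk(means, stds, keep_k, n_samples=10_000, rng=None):
    """Estimate the risk of promoting only the top-`keep_k` configurations
    (ranked by predicted mean loss) without evaluating the rest.

    means, stds: per-configuration predictive mean/std of the loss, e.g.
                 from a surrogate model over partial evaluations.
    Returns a Monte Carlo estimate of the expected extra loss incurred
    if the true best configuration falls outside the kept set.
    """
    rng = rng or np.random.default_rng(0)
    means, stds = np.asarray(means), np.asarray(stds)
    keep = np.argsort(means)[:keep_k]             # configs we would promote
    # Sample plausible "true" losses from the predictive distributions.
    samples = rng.normal(means, stds, size=(n_samples, len(means)))
    best_overall = samples.min(axis=1)            # best loss among all configs
    best_kept = samples[:, keep].min(axis=1)      # best loss among kept configs
    return float(np.mean(best_kept - best_overall))  # expected regret >= 0

def should_jump(means, stds, keep_k, lam=0.10):
    """Jump (skip the remaining low-budget evaluations) only when the
    estimated risk is below the threshold lambda (10% by default,
    mirroring the paper's default for HJ)."""
    return evaluate_jump_risk(means, stds, keep_k) <= lam
```

Under this formulation, a risk near zero means the kept set almost surely contains the best configuration, while a larger λ permits more aggressive jumps; the paper studies sensitivity to λ in its supplemental material.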